ABSTRACT Pairwise sequence comparison methods have
been assessed using proteins whose relationships are known 
reliably from their structures and functions, as described in 
the SCOP database [Murzin, A. G., Brenner, S. E., Hubbard, T. 
& Chothia, C. (1995) J. Mol. Biol. 247, 536-540]. The evaluation
tested the programs BLAST [Altschul, S. F., Gish, W.,
Miller, W., Myers, E. W. & Lipman, D. J. (1990) J. Mol. Biol.
215, 403-410], WU-BLAST2 [Altschul, S. F. & Gish, W. (1996) 
Methods Enzymol. 266, 460-480], FASTA [Pearson, W. R. & 
Lipman, D. J. (1988) Proc. Natl. Acad. Sci. USA 85, 2444-2448], 
and SSEARCH [Smith, T. F. & Waterman, M. S. (1981) J. Mol.
Biol. 147, 195-197] and their scoring schemes. The error rate 
of all algorithms is greatly reduced by using statistical scores 
to evaluate matches rather than percentage identity or raw 
scores. The E-value statistical scores of SSEARCH and FASTA are
reliable: the number of false positives found in our tests agrees 
well with the scores reported. However, the P-values reported 
by BLAST and WU-BLAST2 exaggerate significance by orders of 
magnitude. SSEARCH, FASTA ktup = 1, and WU-BLAST2 perform
best, and they are capable of detecting almost all relationships 
between proteins whose sequence identities are >30%. For 
more distantly related proteins, they do much less well; only 
one-half of the relationships between proteins with 20-30% 
identity are found. Because many homologs have low sequence 
similarity, most distant relationships cannot be detected by 
any pairwise comparison method; however, those which are 
identified may be used with confidence. 



Sequence database searching plays a role in virtually every 
branch of molecular biology and is crucial for interpreting the 
sequences issuing forth from genome projects. Given the 
method's central role, it is surprising that overall and relative 
capabilities of different procedures are largely unknown. It is 
difficult to verify algorithms on sample data because this 
requires large data sets of proteins whose evolutionary rela- 
tionships are known unambiguously and independently of the 
methods being evaluated. However, nearly all known ho- 
mologs have been identified by sequence analysis (the method 
to be tested). Also, it is generally very difficult to know, in the 
absence of structural data, whether two proteins that lack clear 
sequence similarity are unrelated. This has meant that al- 
though previous evaluations have helped improve sequence 
comparison, they have suffered from insufficient, imperfectly 
characterized, or artificial test data. Assessment also has been 
problematic because high quality database sequence searching 
attempts to have both sensitivity (detection of homologs) and 
specificity (rejection of unrelated proteins); however, these 
complementary goals are linked such that increasing one 
causes the other to be reduced. 

Sequence comparison methodologies have evolved rapidly,
so no previously published test has evaluated modern versions
of the commonly used programs. For example, parameters in
BLAST (1) have changed, and WU-BLAST2 (2), which produces
gapped alignments, has become available. The latest version
of FASTA (3) previously tested was 1.6, but the current release
(version 3.0) provides fundamentally different results in the
form of statistical scoring.

The previous reports also have left gaps in our knowledge. 
For example, there has been no published assessment of 
thresholds for scoring schemes more sophisticated than per- 
centage identity. Thus, the widely discussed statistical scoring 
measures have never actually been evaluated on large data- 
bases of real proteins. Moreover, the different scoring schemes 
commonly in use have not been compared. 

Beyond these issues, there is a more fundamental question: 
in an absolute sense, how well does pairwise sequence com- 
parison work? That is, what fraction of homologous proteins 
can be detected using modern database searching methods? 

In this work, we attempt to answer these questions and to 
overcome both of the fundamental difficulties that have hin- 
dered assessment of sequence comparison methodologies. 
First, we use the set of distant evolutionary relationships in the 
SCOP: Structural Classification of Proteins database (4), which 
is derived from structural and functional characteristics (5). 
The SCOP database provides a uniquely reliable set of ho- 
mologs, which are known independently of sequence compar- 
ison. Second, we use an assessment method that jointly mea- 
sures both sensitivity and specificity. This method allows 
straightforward comparison of different sequence searching 
procedures. Further, it can be used to aid interpretation of real 
database searches and thus provide optimal and reliable 
results. 

Previous Assessments of Sequence Comparison. Several 
previous studies have examined the relative performance of 
different sequence comparison methods. The most encom- 
passing analyses have been by Pearson (6, 7), who compared 
the three most commonly used programs. Of these, the Smith-
Waterman algorithm (8) implemented in SSEARCH (3) is the
oldest and slowest but the most rigorous. Modern heuristics 
have provided BLAST (1) the speed and convenience to make 
it the most popular program. Intermediate between these two 
is FASTA (3), which may be run in two modes offering either 
greater speed (ktup = 2) or greater effectiveness (ktup = 1). 
Pearson also considered different parameters for each of these 
programs. 

To test the methods, Pearson selected two representative 
proteins from each of 67 protein superfamilies defined by the 
PIR database (9). Each was used as a query to search the 
database, and the matched proteins were marked as being 
homologous or unrelated according to their membership of PIR 
superfamilies. Pearson found that modern matrices and
"ln-scaling" of raw scores improve results considerably. He also
reported that the rigorous Smith-Waterman algorithm worked
slightly better than FASTA, which was in turn more effective
than BLAST.

Very large scale analyses of matrices have been performed 
(10), and Henikoff and Henikoff (11) also evaluated the 
effectiveness of BLAST and FASTA. Their test with BLAST 
considered the ability to detect homologs above a predeter- 
mined score but had no penalty for methods which also 
reported large numbers of spurious matches. The Henikoffs 
searched the SWISS-PROT database (12) and used PROSITE (13)
to define homologous families. Their results showed that the 
BLOSUM62 matrix (14) performed markedly better than the 
extrapolated PAM-series matrices (15), which previously had 
been popular. 

A crucial aspect of any assessment is the data that are used 
to test the ability of the program to find homologs. But in 
Pearson's and the Henikoffs' evaluations of sequence comparison,
the correct results were effectively unknown. This is
because the superfamilies in PIR and PROSITE are principally 
created by using the same sequence comparison methods 
which are being evaluated. Interdependency of data and 
methods creates a "chicken and egg" problem and means, for
example, that new methods would be penalized for correctly
identifying homologs missed by older programs. For instance,
immunoglobulin variable and constant domains are clearly
homologous, but PIR places them in different superfamilies.
The problem is widespread: each superfamily in PIR 48.00 with
a structural homolog is itself homologous to an average of 1.6
other PIR superfamilies (16).

To surmount these sorts of difficulties, Sander and Schneider
(17) used protein structures to evaluate sequence comparison.
Rather than comparing different sequence comparison
algorithms, their work focused on determining a length-dependent
threshold of percentage identity, above which all
proteins would be of similar structure. A result of this analysis
was the HSSP equation; it states that proteins with 25% identity
over 80 residues will have similar structures, whereas shorter 
alignments require higher identity. (Other studies also have 
used structures (18-20), but these focused on a small number 
of model proteins and were principally oriented toward eval- 
uating alignment accuracy rather than homology detection.) 

A general solution to the problem of scoring comes from 
statistical measures (i.e., E-values and P-values) based on the 
extreme value distribution (21). Extreme value scoring was 
implemented analytically in the BLAST program using the 
Karlin and Altschul statistics (22, 23) and empirical ap- 
proaches have been recently added to FASTA and SSEARCH. In 
addition to being heralded as a reliable means of recognizing 
significantly similar proteins (24, 25), the mathematical trac- 
tability of statistical scores "is a crucial feature of the BLAST 
algorithm" (1). The validity of this scoring procedure has been 
tested analytically and empirically (see ref. 2 and references in 
ref. 24). However, all large empirical tests used random 
sequences that may lack the subtle structure found within 
biological sequences (26, 27) and obviously do not contain any 
real homologs. Thus, although many researchers have sug- 
gested that statistical scores be used to rank matches (24, 25, 
28), there have been no large rigorous experiments on biolog- 
ical data to determine the degree to which such rankings are 
superior. 

A Database for Testing Homology Detection. Since the 
discovery that the structures of hemoglobin and myoglobin are 
very similar though their sequences are not (29), it has been 
apparent that comparing structures is a more powerful (if less 
convenient) way to recognize distant evolutionary relation- 
ships than comparing sequences. If two proteins show a high 
degree of similarity in their structural details and function, it
is very probable that they have an evolutionary relationship
though their sequence similarity may be low. 

The recent growth of protein structure information, combined
with the comprehensive evolutionary classification in
the SCOP database (4, 5), has allowed us to overcome previous
limitations. With these data, we can evaluate the performance
of sequence comparison methods on real protein sequences 
whose relationships are known confidently. The SCOP database 
uses structural information to recognize distant homologs, the 
large majority of which can be determined unambiguously. 
These superfamilies, such as the globins or the immunoglobu- 
lins, would be recognized as related by the vast majority of the 
biological community despite the lack of high sequence sim- 
ilarity. 

From SCOP, we extracted the sequences of domains of 
proteins in the Protein Data Bank (PDB) (30) and created two 
databases. In one (PDB90D-B), every domain is <90%
identical to every other; in the second (PDB40D-B), every domain
is <40% identical. The databases were created by first sorting all
protein domains in SCOP by their quality and making a list. The 
highest quality domain was selected for inclusion in the 
database and removed from the list. Also removed from the list 
(and discarded) were all other domains above the threshold 
level of identity to the selected domain. This process was 
repeated until the list was empty. The PDB40D-B database
contains 1,323 domains, which have 9,044 ordered pairs of
distant relationships, or ≈0.5% of the total 1,749,006 ordered
pairs. In PDB90D-B, the 2,079 domains have 53,988 relationships,
representing 1.2% of all pairs. Low complexity regions
of sequence can achieve spurious high scores, so these were
masked in both databases by processing with the SEG program
(27) using recommended parameters: 12 1.8 2.0. The databases
used in this paper are available from http://sss.stanford.edu/sss/,
and databases derived from the current version of SCOP
may be found at http://scop.mrc-lmb.cam.ac.uk/scop/.
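
The selection procedure described above is a greedy filter, and a minimal sketch of it may make the steps concrete. This is an illustration only, not the code used to build PDB40D-B and PDB90D-B: the function names, the `quality` mapping, and the `identity` callback are hypothetical placeholders, and we assume that a domain is discarded when its identity to a selected domain is at or above the threshold.

```python
from typing import Callable, Dict, List


def select_nonredundant(
    domains: List[str],
    quality: Dict[str, float],
    identity: Callable[[str, str], float],
    threshold: float,
) -> List[str]:
    """Greedy selection: keep the best remaining domain, drop its near-duplicates."""
    # Sort once, highest quality first (the "list" of the text).
    remaining = sorted(domains, key=lambda d: quality[d], reverse=True)
    selected: List[str] = []
    while remaining:
        best = remaining.pop(0)
        selected.append(best)
        # Assumption: discard domains whose identity to `best` is >= threshold.
        remaining = [d for d in remaining if identity(best, d) < threshold]
    return selected


def ordered_pairs(n: int) -> int:
    """Number of ordered (query, target) pairs among n distinct domains."""
    return n * (n - 1)
```

With 1,323 domains, `ordered_pairs` gives the 1,749,006 ordered pairs quoted above for PDB40D-B.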

Analyses from both databases were generally consistent, but 
PDB40D-B focuses on distantly related proteins and reduces the 
heavy overrepresentation in the PDB of a small number of 
families (31, 32), whereas PDB90D-B (with more sequences) 
improves evaluations of statistics. Except where noted other- 
wise, the distant homolog results here are from PDB40D-B. 
Although the precise numbers reported here are specific to the 
structural domain databases used, we expect the trends to be 
general. 

Assessment Data and Procedure. Our assessment of se- 
quence comparison may be divided into four different major 
categories of tests. First, using just a single sequence compar- 
ison algorithm at a time, we evaluated the effectiveness of 
different scoring schemes. Second, we assessed the reliability
of scoring procedures, including an evaluation of the validity 
of statistical scoring. Third, we compared sequence compari- 
son algorithms (using the optimal scoring scheme) to deter- 
mine their relative performance. Fourth, we examined the 
distribution of homologs and considered the power of pairwise 
sequence comparison to recognize them. All of the analyses 
used the databases of structurally identified homologs and a 
new assessment criterion. 

The analyses tested BLAST (1), version 1.4.9MP, and WU-BLAST2
(2), version 2.0a13MP. Also assessed was the FASTA
package, version 3.0t76 (3), which provided FASTA and the
SSEARCH implementation of Smith-Waterman (8). For
SSEARCH and FASTA, we used BLOSUM45 with gap penalties
-12/-1 (7, 16). The default parameters and matrix (BLOSUM62)
were used for BLAST and WU-BLAST2.

The "Coverage Vs. Error" Plot. To test a particular protocol 
(comprising a program and scoring scheme), each sequence 
from the database was used as a query to search the database. 
This yielded ordered pairs of query and target sequences with 
associated scores, which were sorted, on the basis of their 
scores, from best to worst. The ideal method would have
perfect separation, with all of the homologs at the top of the
list and unrelated proteins below. In practice, perfect separation
is impossible to achieve, so instead one is interested in
drawing a threshold above which there are the largest number
of related pairs of sequences consistent with an acceptable
error rate.

Fig. 1. Coverage vs. error plots of different scoring schemes for
SSEARCH Smith-Waterman. (A) Analysis of PDB40D-B database. (B)
Analysis of PDB90D-B database. All of the proteins in the database
were compared with each other using the SSEARCH program. The
results of this single set of comparisons were considered using five
different scoring schemes and assessed. The graphs show the
coverage and errors per query (EPQ) for statistical scores, raw
scores, and three measures using percentage identity. In the
coverage vs. error plot, the x axis indicates the fraction of all
homologs in the database (known from structure) which have been
detected. Precisely, it is the number of detected pairs of proteins
with the same fold divided by the total number of pairs from a
common superfamily. PDB40D-B contains a total of 9,044 homologs,
so a score of 10% indicates identification of 904 relationships. The
y axis reports the number of EPQ. Because there are 1,323 queries
made in the PDB40D-B all-vs.-all comparison, 13 errors correspond
to 0.01, or 1% EPQ. The y axis is presented on a log scale to show
results over the widely varying degrees of accuracy which may be
desired. The scores that correspond to the levels of EPQ and
coverage are shown in Fig. 4 and Table 1. The graph demonstrates
the trade-off between sensitivity and selectivity. As more homologs
are found (moving to the right), more errors are made (moving up).
The ideal method would be in the lower right corner of the graph,
which corresponds to identifying many evolutionary relationships
without selecting unrelated proteins. Three measures of percentage
identity are plotted. Percentage identity within alignment is the
degree of identity within the aligned region of the proteins, without
consideration of the alignment length. Percentage identity within
both is the number of identical residues in the aligned region as a
percentage of the average length of the query and target proteins.
The HSSP equation (17) is $H = 290.15\,l^{-0.562}$, where $l$ is
length, for $10 < l < 80$; $H > 100$ for $l < 10$; $H = 24.7$ for
$l > 80$. The percentage identity HSSP-adjusted score is the percent
identity within the alignment minus H. Smith-Waterman raw scores
and E-values were taken directly from the sequence comparison
program.

Our procedure involved measuring the coverage and error 
for every threshold. Coverage was defined as the fraction of 
structurally determined homologs that have scores above the 
selected threshold; this reflects the sensitivity of a method. 
Errors per query (EPQ), an indicator of selectivity, is the 
number of nonhomologous pairs above the threshold divided 
by the number of queries. Graphs of these data, called 
coverage vs. error plots, were devised to understand how
protocols compare at different levels of accuracy. These
graphs share effectively all of the beneficial features of Receiver
Operating Characteristic (ROC) plots (33, 34) but
better represent the high degrees of accuracy required in 
sequence comparison and the huge background of nonho- 
mologs. 
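
The coverage and EPQ definitions given above translate directly into a short computation. The following is a minimal sketch under stated assumptions rather than the assessment code itself: the function names are hypothetical, each hit is an ordered pair labeled as homologous or not, and higher scores are taken to be better (E-values or P-values would be negated first).

```python
from typing import List, Tuple


def coverage_vs_error(
    hits: List[Tuple[float, bool]],
    total_homologs: int,
    num_queries: int,
) -> List[Tuple[float, float, float]]:
    """Return (score, coverage, EPQ) at each successive score threshold."""
    ranked = sorted(hits, key=lambda h: h[0], reverse=True)
    curve = []
    true_pos = 0
    false_pos = 0
    for i, (score, is_homolog) in enumerate(ranked):
        if is_homolog:
            true_pos += 1
        else:
            false_pos += 1
        # Emit a point only after the last hit of a tied score,
        # so tied hits share a single threshold.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            curve.append((score, true_pos / total_homologs, false_pos / num_queries))
    return curve


def coverage_at_epq(curve: List[Tuple[float, float, float]], max_epq: float) -> float:
    """Largest coverage reached without exceeding max_epq errors per query."""
    ok = [cov for _, cov, epq in curve if epq <= max_epq]
    return max(ok) if ok else 0.0
```

Calling `coverage_at_epq(curve, 0.01)` on such a curve corresponds to reading off the coverage at 1% EPQ from a coverage vs. error plot.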

This assessment procedure is directly relevant to practical 
sequence database searching, for it provides precisely the 
information necessary to perform a reliable sequence database 
search. The EPQ measure places a premium on score consis- 
tency; that is, it requires scores to be comparable for different 
queries. Consistency is an aspect which has been largely
ignored in previous tests but is essential for the straightforward
or automatic interpretation of sequence comparison results.
Further, it provides a clear indication of the confidence that
should be ascribed to each match. Indeed, the EPQ measure
should approximate the expectation value reported by database
searching programs, if the programs' estimates are accurate.

Fig. 2. Unrelated proteins with high percentage identity.
Hemoglobin β-chain (PDB code 1hds chain b, ref. 38, Left) and
cellulase E2 (PDB code 1tml, ref. 39, Right) have 39% identity over
64 residues, a level which is often believed to be indicative of
homology. Despite this high degree of identity, their structures
strongly suggest that these proteins are not related. Appropriately,
neither the raw alignment score of 85 nor the E-value of 1.3 is
significant. Proteins rendered by RASMOL (40).

Fig. 3. Length and percentage identity of alignments of unrelated
proteins in PDB90D-B: Each pair of nonhomologous proteins found
with SSEARCH is plotted as a point whose position indicates the
length and the percentage identity within the alignment. Because
alignment length and percentage identity are quantized, many pairs
of proteins may have exactly the same alignment length and
percentage identity. The line shows the HSSP threshold (though it is
intended to be applied with a different matrix and parameters).

Fig. 4. Reliability of statistical scores in PDB90D-B: Each line shows
the relationship between reported statistical score and actual error
rate for a different program. E-values are reported for SSEARCH and
FASTA, whereas P-values are shown for BLAST and WU-BLAST2. If the
scoring were perfect, then the number of errors per query and the
E-values would be the same, as indicated by the upper bold line.
(P-values should be the same as EPQ for small numbers and diverge
at higher values, as indicated by the lower bold line.) E-values from
SSEARCH and FASTA are shown to have good agreement with EPQ but
underestimate the significance slightly. BLAST and WU-BLAST2 are
overconfident, with the degree of exaggeration dependent upon the
score. The results for PDB40D-B were similar to those for PDB90D-B
despite the difference in number of homologs detected. This graph
could be used to roughly calibrate the reliability of a given statistical
score.

The Performance of Scoring Schemes. All of the programs 
tested could provide three fundamental types of scores. The 
first score is the percentage identity, which may be computed 
in several ways based on either the length of the alignment or 
the lengths of the sequences. The second is a "raw" or 
"Smith-Waterman" score, which is the measure optimized by 
the Smith-Waterman algorithm and is computed by summing 
the substitution matrix scores for each position in the alignment
and subtracting gap penalties. In BLAST, a measure
related to this score is scaled into bits. Third is a statistical
score based on the extreme value distribution. These results 
are summarized in Fig. 1. 
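
To make the definition of the raw score concrete, the sketch below sums substitution scores over the aligned columns and adds the (negative) gap penalties. It is illustrative only: the function name and the tiny matrix in the usage note are hypothetical, and we assume a convention in which the first residue of a gap costs the first penalty (-12) and each further residue costs the second (-1); the programs tested may count gap costs differently.

```python
from typing import Dict, Tuple

GAP = "-"


def raw_score(
    aligned_query: str,
    aligned_target: str,
    matrix: Dict[Tuple[str, str], int],
    gap_open: int = -12,
    gap_extend: int = -1,
) -> int:
    """Sum substitution scores over aligned columns, plus negative gap penalties."""
    if len(aligned_query) != len(aligned_target):
        raise ValueError("aligned strings must have equal length")
    score = 0
    gap_in = None  # which sequence the currently open gap lies in, if any
    for q, t in zip(aligned_query, aligned_target):
        if q == GAP or t == GAP:
            where = "query" if q == GAP else "target"
            # Assumed convention: first gap residue pays gap_open,
            # each further residue of the same gap pays gap_extend.
            score += gap_extend if gap_in == where else gap_open
            gap_in = where
        else:
            score += matrix[(q, t)]
            gap_in = None
    return score
```

For instance, with a toy matrix scoring A/A as 5 and G/G as 6, aligning `AG--A` against `AGGGA` gives 5 + 6 - 12 - 1 + 5 = 3.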

Sequence Identity. Though it has been long established that 
percentage identity is a poor measure (35), there is a common 
rule-of-thumb stating that 30% identity signifies homology. 
Moreover, publications have indicated that 25% identity can 
be used as a threshold (17, 36). We find that these thresholds, 
originally derived years ago, are not supported by present 
results. As databases have grown, so have the possibilities for 
chance alignments with high identity; thus, the reported cutoffs 
lead to frequent errors. Fig. 2 shows one of the many pairs of 
proteins with very different structures that nonetheless have 
high levels of identity over considerable aligned regions. 
Despite the high identity, the raw and the statistical scores for 
such incorrect matches are typically not significant. The prin- 
cipal reasons percentage identity does so poorly seem to be 
that it ignores information about gaps and about the conser- 
vative or radical nature of residue substitutions. 

From the PDB90D-B analysis in Fig. 3, we learn that 30% 
identity is a reliable threshold for this database only for 
sequence alignments of at least 150 residues. Because one 
unrelated pair of proteins has 43.5% identity over 62 residues, 
it is probably necessary for alignments to be at least 70 residues 
in length before 40% is a reasonable threshold, for a database 
of this particular size and composition. 

At a given reliability, scores based on percentage identity 
detect just a fraction of the distant homologs found by 
statistical scoring. If one measures the percentage identity in 
the aligned regions without consideration of alignment length, 
then a negligible number of distant homologs are detected. 
Use of the HSSP equation improves the value of percentage
identity, but even this measure can find only 4% of all known 
homologs at 1% EPQ. In short, percentage identity discards 
most of the information measured in a sequence comparison. 
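
The three percentage identity measures compared here can be sketched as follows, using the definitions and the HSSP equation given in the legend to Fig. 1. The function names are hypothetical. Two choices are our own assumptions: the power-law form is applied at the boundary lengths 10 and 80, which the legend leaves unspecified, and alignments shorter than 10 residues, for which the legend gives only H > 100, are assigned an infinite threshold so that they can never pass.

```python
def hssp_threshold(length: int) -> float:
    """HSSP identity threshold H (percent) for an alignment of `length` residues."""
    if length > 80:
        return 24.7
    if length < 10:
        # Assumption: the legend gives only H > 100; infinity means "never passes".
        return float("inf")
    return 290.15 * length ** -0.562


def identity_within_alignment(identical: int, aligned_length: int) -> float:
    """Percent identity over the aligned region only."""
    return 100.0 * identical / aligned_length


def identity_within_both(identical: int, query_len: int, target_len: int) -> float:
    """Identical residues as a percentage of the mean sequence length."""
    return 100.0 * identical / ((query_len + target_len) / 2.0)


def hssp_adjusted(identical: int, aligned_length: int) -> float:
    """Percent identity within the alignment minus the HSSP threshold H."""
    return identity_within_alignment(identical, aligned_length) - hssp_threshold(aligned_length)
```

As a usage example, 25 identical residues over 64 aligned positions (about 39%, the figures quoted for the unrelated pair in Fig. 2; the count of 25 is back-calculated and therefore approximate) gives a positive HSSP-adjusted score, which illustrates how an unrelated pair can clear an identity-based threshold.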

Raw Scores. Smith-Waterman raw scores perform better 
than percentage identity (Fig. 1), but ln-scaling (7) provided no
notable benefit in our analysis. It is necessary to be very precise 
when using either raw or bit scores because a 20% change in 
cutoff score could yield a tenfold difference in EPQ. However, 
it is difficult to choose appropriate thresholds because the 
reliability of a bit score depends on the lengths of the proteins 
matched and the size of the database. Raw score thresholds 
also are affected by matrix and gap parameters. 

Statistical Scores. Statistical scores were introduced partly 
to overcome the problems that arise from raw scores. This 
scoring scheme provides the best discrimination between 
homologous proteins and those which are unrelated. Most
likely, its power can be attributed to its incorporation of more
information than any other measure; it takes account of the
full substitution and gap data (like raw scores) but also has
details about the sequence lengths and composition and is
scaled appropriately.

Fig. 5. Coverage vs. error plots of different sequence comparison
methods: Five different sequence comparison methods are evaluated,
each using statistical scores (E- or P-values). (A) PDB40D-B database.
In this analysis, the best method is the slow SSEARCH, which finds
18% of relationships at 1% EPQ. FASTA ktup = 1 and WU-BLAST2 are
almost as good. (B) PDB90D-B database. The quick WU-BLAST2 program
provides the best coverage at 1% EPQ on this database, although at
higher levels of error it becomes slightly worse than FASTA ktup = 1
and SSEARCH.

Biochemistry: Brenner et al.
Proc. Natl. Acad. Sci. USA 95 (1998) 6077

We find that statistical scores are not only powerful, but also 
easy to interpret. SSEARCH and FASTA show close agreement 
between statistical scores and actual number of errors per 
query (Fig. 4). The expectation value score gives a good, 
slightly conservative estimate of the chances of the two se- 
quences being found at random in a given query. Thus, an 
E-value of 0.01 indicates that roughly one pair of nonhomologs 
of this similarity should be found in every 100 different queries. 
Neither raw scores nor percentage identity can be interpreted 
in this way, and these results validate the suitability of the 
extreme value distribution for describing the scores from a 
database search. 

The P-values from BLAST also should be directly interpret- 
able but were found to overstate significance by more than two 
orders of magnitude for 1% EPQ for this database. Nonethe- 
less, these results strongly suggest that the analytic theory is 
fundamentally appropriate. WU-BLAST2 scores were more re- 
liable than those from BLAST, but also exaggerate expected 
confidence by more than an order of magnitude at 1% EPQ. 
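
The relation between E-values and P-values can be stated compactly; we give it here as the standard textbook form, not as a result of our tests. The symbols $K$, $\lambda$, $m$, and $n$ are the conventional Karlin-Altschul parameters and sequence or database lengths, assumed for illustration. If the number of chance matches scoring at least $S$ is approximately Poisson with mean $E$, then

\[
E = K\,m\,n\,e^{-\lambda S}, \qquad P = 1 - e^{-E} \approx E \quad (E \ll 1).
\]

For example, $E = 0.01$ gives $P \approx 0.00995$, whereas $E = 1$ gives $P \approx 0.63$. An accurate P-value should therefore track EPQ closely at the stringent thresholds considered here, which is why overstatement by orders of magnitude at 1% EPQ cannot be attributed to the difference between the two kinds of score.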

Overall Detection of Homologs and Comparison of Algo- 
rithms. The results in Fig. 5A and Table 1 show that pairwise 
sequence comparison is capable of identifying only a small 
fraction of the homologous pairs of sequences in PDB40D-B. 
Even SSEARCH with E-values, the best protocol tested, could 
find only 18% of all relationships at a 1% EPQ. BLAST, which 
identifies 15%, was the worst performer, whereas FASTA 
ktup = 1 is nearly as effective as SSEARCH. FASTA ktup = 2 and
WU-BLAST2 are intermediate in their ability to detect homologs.
Comparison of different algorithms indicates that
those capable of identifying more homologs are generally
slower. SSEARCH is 25 times slower than BLAST and 6.5 times
slower than FASTA ktup = 1. WU-BLAST2 is slightly faster than
FASTA ktup = 2, but the latter has more interpretable scores. 

In PDB90D-B, where there are many close relationships, the 
best method can identify only 38% of structurally known 
homologs (Fig. 5B). The method which finds that many 
relationships is WU-BLAST2. Consequently, we infer that the 
differences between FASTA ktup = 1, SSEARCH, and WU-BLAST2
programs are unlikely to be significant when compared with 
variation in database composition and scoring reliability. 

Fig. 6 helps to explain why most distant homologs cannot be 
found by sequence comparison: a great many such relation- 
ships have no more sequence identity than would be expected 
by chance. SSEARCH with E-values can recognize >90% of the 
homologous pairs with 30-40% identity. In this region, there 
are 30 pairs of homologous proteins that do not have signif- 
icant E-values, but 26 of these involve sequences with <50 
residues. Of sequences having 25-30% identity, 75% are 
identified by SSEARCH E-values. However, although the num- 
ber of homologs grows at lower levels of identity, the detection 
falls off sharply: only 40% of homologs with 20-25% identity
are detected and only 10% of those with 15-20% can be found.
These results show that statistical scores can find related
proteins whose identity is remarkably low; however, the power
of the method is restricted by the great divergence of many
protein sequences.

Fig. 6. Distribution and detection of homologs in PDB40D-B. Bars
show the distribution of homologous pairs in PDB40D-B according to
their identity (using the measure of identity in both). Filled regions
indicate the number of these pairs found by the best database
searching method (SSEARCH with E-values) at 1% EPQ. The PDB40D-B
database contains proteins with <40% identity, and as shown on this
graph, most structurally identified homologs in the database have
diverged extremely far in sequence and have <20% identity. Note
that the alignments may be inaccurate, especially at low levels of
identity. Filled regions show that SSEARCH can identify most
relationships that have 25% or more identity, but its detection wanes
sharply below 25%. Consequently, the great sequence divergence of
most structurally identified evolutionary relationships effectively
defeats the ability of pairwise sequence comparison to detect them.
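
The per-identity breakdown of detection discussed here (and drawn in Fig. 6) amounts to binning the homologous pairs by identity and counting how many lie above the chosen score threshold. A minimal sketch follows; the function name and the 5% bin width are assumptions made for illustration, and each input pair is taken to carry its percentage identity together with a flag saying whether it was detected at the chosen EPQ.

```python
from typing import Dict, List, Tuple


def detection_by_identity(
    pairs: List[Tuple[float, bool]],
    bin_width: float = 5.0,
) -> Dict[float, Tuple[int, int, float]]:
    """Map each bin's lower edge to (homolog pairs, detected pairs, fraction detected)."""
    counts: Dict[float, List[int]] = {}
    for identity, detected in pairs:
        # Lower edge of the identity bin containing this pair.
        lower = (identity // bin_width) * bin_width
        total_found = counts.setdefault(lower, [0, 0])
        total_found[0] += 1
        if detected:
            total_found[1] += 1
    return {
        lower: (total, found, found / total)
        for lower, (total, found) in sorted(counts.items())
    }
```

The fraction in each bin corresponds to the filled portion of the matching bar in a plot of this kind.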

After completion of this work, a new version of pairwise 
BLAST was released: BLASTGP (37). It supports gapped align- 
ments, like WU-BLAST2, and dispenses with sum statistics. Our 
initial tests on BLASTGP using default parameters show that its 
E-values are reliable and that its overall detection of homologs 
was substantially better than that of ungapped BLAST, but not
quite equal to that of WU-BLAST2. 

CONCLUSION 

The general consensus amongst experts (see refs. 7, 24, 25, 27 
and references therein) suggests that the most effective sequence
searches are made by (i) using a large current database
in which the protein sequences have been complexity masked
and (ii) using statistical scores to interpret the results. Our
experiments fully support this view. 

Our results also suggest two further points. First, the E-val- 
ues reported by FASTA and SSEARCH give fairly accurate 
estimates of the significance of each match, but the P-values 
provided by BLAST and WU-BLAST2 underestimate the true
extent of errors. Second, SSEARCH, WU-BLAST2, and FASTA
ktup = 1 perform best, though BLAST and FASTA ktup = 2
detect most of the relationships found by the best procedures
and are appropriate for rapid initial searches.

Table 1. Summary of sequence comparison methods with PDB40D-B

Method                                  Relative time*  1% EPQ cutoff       Coverage at 1% EPQ (%)
SSEARCH % identity: within alignment    25.5            >70%                <0.1
SSEARCH % identity: within both         25.5            34%                 3.0
SSEARCH % identity: HSSP-scaled         25.5            35% (HSSP + 9.8)    4.0
SSEARCH Smith-Waterman raw scores       25.5            142                 10.5
SSEARCH E-values                        25.5            0.03                18.4
FASTA ktup = 1 E-values                 3.9             0.03                17.9
FASTA ktup = 2 E-values                 1.4             0.03                16.7
WU-BLAST2 P-values                      1.1             0.003               17.5
BLAST P-values                          1.0             0.00016             14.8

*Times are from large database searches with genome proteins.

The homologous proteins that are found by sequence com- 
parison can be distinguished with high reliability from the huge 
number of unrelated pairs. However, even the best database 
searching procedures tested fail to find the large majority of 
distant evolutionary relationships at an acceptable error rate. 
Thus, if the procedures assessed here fail to find a reliable 
match, it does not imply that the sequence is unique; rather, it 
indicates that any relatives it might have are distant ones.** 



** Additional and updated information about this work, including 
supplementary figures, may be found at http://sss.stanford.edu/sss/. 

ABSTRACT:

This article is an updated report of a symposium held at the June 
2000 annual meeting of the American Society for Pharmacology 
and Experimental Therapeutics in Boston. The symposium was 
sponsored by the ASPET Divisions for Drug Metabolism and
Molecular Pharmacology. The report covers research from the
authors' laboratories on the structure and regulation of
UDP-glucuronosyltransferase (UGT) genes, glucuronidation of xenobiotics
and endobiotics, the toxicological relevance of UGTs, the role of 
UGT polymorphisms in cancer susceptibility, and gene therapy for 
UGT deficiencies. 



For most xenobiotics and many endobiotics, glucuronidation con- 
stitutes a major route of elimination and thereby may substantially 
modulate substrate concentrations and effects. In some cases, glucu- 
ronidation forms the biologically active molecule. Recent studies have 
revealed an extensive superfamily of UDP-glucuronosyltransferases 
(UGTs), previously termed glucuronyl transferases, which catalyze
the conjugation of UDP-glucuronic acid with lipid-soluble substrates 
to form polar conjugates that are excreted in the urine and feces. These 
studies have provided fundamental insights into UGT gene structure 
and regulation, isozyme substrate selectivity, and interindividual
variability. Whereas there remains much to learn about the potential
biological relevance of UGTs, deficient glucuronidation can result in
either elevated tissue concentrations and direct toxicity of substrates, 
as with the endobiotic bilirubin, or, alternatively, enhanced bioacti- 
vation of substrates to toxic reactive intermediates, as in the case of 
acetaminophen and benzo[a]pyrene. Interindividual UGT variability 
likely plays an important role in drug efficacy and xenobiotic toxicity, 
as well as in hormonal regulation and certain diseases, which in some 
cases may be amenable to therapeutic manipulations including gene 
therapy. 

Structure and Tissue-Specific Regulation of UGT Genes 
(P.I.M., P.A.G., Y.I., A.J.H.) 

The UGT content of cells and tissues is a major determinant of our 
response to those chemicals that are primarily eliminated by conju- 
gation with glucuronic acid. There are marked interindividual differ- 
ences in the content of UGTs in the liver and other organs including 
the gastrointestinal tract. For example, only one third of the
population appears to express UGT1A1, UGT1A3, and UGT1A6 in their
gastric epithelium (Strassburg et al., 1998). Studies on the mecha- 
nisms that regulate UGT genes, in a temporal and tissue-specific 
manner, should contribute significantly to understanding the basis for 
these differences. Such studies should also aid in the design of 
molecular probes to assess the capacity of individuals to metabolize 
specific drugs and toxins, before exposure to these agents. 

The genes encoding UGTs that use UDP-glucuronic acid as sugar 
donor have been assigned to two families (Mackenzie et al., 1997). 
The UGT1 family constitutes a complex gene locus on human
chromosome 2q37 and comprises 13 first exons that encode the unique
N-terminal domains of the UGT1A proteins and exons 2 to 5 that
encode the C-terminal domain, which is identical in all UGT1A 
family members (Owens and Ritter, 1992). The UGT2 family, in 
contrast, is encoded by separate genes clustered on chromosome 4q13
and consists of the UGT2A and UGT2B subfamilies. The UGT2B
genes that have been analyzed to date, viz., rat UGT2B1 (Mackenzie
and Rodbourn, 1990) and UGT2B2 (Haque et al., 1991), and human
UGT2B4 (Monaghan et al., 1997), UGT2B7 (Ishii et al., 2000),
UGT2B15 (Turgeon et al., 2000), and UGT2B17 (Beaulieu et al.,
1997), consist of six exons and have highly similar exon/intron
boundaries. Accumulating evidence demonstrates that many of the 17 
human UGTs characterized to date exhibit tissue-specific patterns of 
expression. For example, UGT1A3 is found in the liver, kidney, and 
prostate, and throughout the gastrointestinal tract, whereas UGT1A8 
is mainly restricted to the colon (Mojarrabi and Mackenzie, 1998). 
The factors that govern this specificity of UGT expression remain 
largely unknown. 

Studies on rodent UGT genes have demonstrated that the transcrip- 
tion factors hepatocyte nuclear factor 1 (HNF1) and CAAT-enhancer 
binding protein are important positive regulators of UGT expression 
in the liver (Hansen et al., 1997, 1998). These nuclear proteins, which
are members of the homeodomain and basic region-leucine zipper
groups of transcription factors, respectively, have been shown to be
important in the liver-specific expression of other genes such as
albumin (Lichtsteiner et al., 1987). Furthermore, mice in which the
hepatic expression of these factors has been substantially reduced
display deficiencies in hepatic UGT1A1 and UGT2B expression (Lee
et al., 1997). With the recent isolation and characterization of several
human UGT genes, evidence is accumulating that many of the factors 
that contribute to liver-specific expression of UGTs in rodents are also 
important in regulating UGTs in human liver. This is exemplified by 
studies on the human UGT2B7 gene (Ishii et al., 2000). Studies on this 
UGT gene were initiated since UGT2B7 glucuronidates many impor- 
tant therapeutic drugs, including nonsteroidal anti-inflammatory 
drugs, morphine, oxazepam, and zidovudine, and is found in both the 
liver and the gastrointestinal tract (Radominska-Pandya et al., 1999).

The proximal promoter of the UGT2B7 gene contains an A/T-rich
region just upstream from the transcription start site. Cotransfection
experiments in the human liver-derived cell line, HepG2, and gel
electrophoretic mobility shift assays demonstrated that HNF1α bound
to this region to enhance UGT2B7 promoter activity. The closely
related transcription factor HNF1β had little effect on UGT2B7
promoter activity. Furthermore, mutations in the A/T-rich region
abolished the capacity of HNF1α to enhance promoter activity. Further
investigation is required to define other factors that regulate the
UGT2B7 gene, either independently or via interactions with HNF1α.
In this respect, it is interesting to note that the ubiquitous transcription
factor, octamer transcription factor-1, acts synergistically with
HNF1α to enhance UGT2B7 promoter activity in HepG2 cells.
Octamer transcription factor-1 is another member of the homeodomain
group of transcription factors. It acts as a positive regulator of
UGT2B7 expression by tethering to HNF1α and presumably
promoting assembly of the RNA polymerase complex, rather than by directly
binding to DNA (Ishii et al., 2000).

A comparison of the proximal promoter sequences of several
human UGT genes reveals the presence of a similar A/T-rich region in
many of these genes (Fig. 1). The binding of HNF1α to this sequence
has been confirmed for the UGT1A1 and UGT2B17 genes (Bernard et
al., 1999; Gregory et al., 2000). It is interesting to note that this
putative HNF1 binding site is present in a similar position in the
proximal promoters of those UGT genes that are expressed in the
liver, but it is not present in the same position in those genes expressed
exclusively in extrahepatic tissues.

[Sequence alignment of the UGT1A and UGT2B proximal promoters; not legible in this copy.]

Fig. 1. Alignment of UGT proximal promoter sequences.

The sequences, including those of UGT pseudogenes (denoted by P), were
obtained from GenBank, accession numbers AF180372 (UGT1A1), M84126
(UGT1A2P), M84127 (UGT1A3), M84128 (UGT1A4), M84129 (UGT1A5),
AF014112 (UGT1A6), U39570 (UGT1A7), U42604 (UGT1A8), U39550
(UGT1A10), U39551 (UGT1A11P), U39552 (UGT1A12P), AF179875 (UGT2B4),
AF282881 (UGT2B7), AF179873 (UGT2B11), AF179881 (UGT2B15), AF179874
(UGT2B17), AF179876 (UGT2B24P), AF179877 (UGT2B25P), AF179878
(UGT2B26P), AF179879 (UGT2B27P), and AF179880 (UGT2B28P). The putative
HNF1 binding site is shown in italics. The sites that have been demonstrated
experimentally to bind HNF1 protein are underlined. Note that UGT1A7, UGT1A8,
and UGT1A10, which are not expressed in the liver, do not contain a consensus
HNF1 binding site in a position equivalent to that of the other genes.

Although many UGTs are expressed in extrahepatic tissues, the
mechanisms that regulate their tissue-specific expression have not
been identified. The prostate contains UGT2B17, which is thought to
play an important role in regulating intracellular androgen
concentrations (Guillemette et al., 1997). The prostate-derived LNCaP cell line
contains large amounts of this enzyme and, hence, is a suitable model
to investigate UGT2B17 regulation. The UGT2B17 gene promoter is
substantially activated by exogenous HNF1α in transfected LNCaP
cells. However, unlike HepG2 cells, LNCaP cells do not contain
endogenous HNF1α but do contain the related transcription factor
HNF1β. The latter appears to be a positive regulator of UGT2B17
expression in these cells and exerts its effects through a site that is
distal to the HNF1α binding site utilized in HepG2 cells (Gregory et
al., 2000). Factors specifying the expression of UGTs in other tissues
remain to be identified.

Much work is required to define the mechanisms that contribute to 
the differential expression of UGT genes in the liver and other organs. 
Nevertheless, it is now evident that different factors and different 
combinations of factors regulate the same promoter in different cell 
types. 

Glucuronidation of Xenobiotics and Endobiotics 
(J.K.R., F.K.K.) 

Progress is continuing in cloning and establishing the substrate 
specificities of the individual UGT enzymes that catalyze this reac- 
tion. Their identification is essential for understanding why individ- 
uals vary in their rates of glucuronidation, altering the pharmacolog- 
ical and toxicological responses to different agents. Currently, there 
are approximately 20 distinct UGTs known in both humans and rats, the best
characterized animal species. These can be divided into two families, 
UGT1 and UGT2, on the basis of sequence homology. Studies in our 
laboratory have focused on the substrate specificity and regulation of 
UGT1 isoforms. 

The UGT1A1 form appears to be the most abundant UGT1 isoform 
expressed in liver (Ritter et al., 1992; Ikushiro et al., 1995) and is the 
principal isoform involved in the glucuronidation of bilirubin (Bosma 
et al., 1994). UGT1A1 is also selectively active toward certain phenols
[e.g., SN-38 (Iyer et al., 1998)] and 17α-ethinylestradiol (Ebner
et al., 1993; Senafi et al., 1994). The available evidence points to high 
interindividual variability in UGT1A1 expression. In a study of 
UGT1A1 variation in human liver donor samples, marked variation 
(>50-fold) was observed in the microsomal UGT1A1 protein level 
(Ritter et al., 1999). Three samples with the highest UGT1A1 level 
were from patients who had received the anticonvulsant inducing 
agent, phenytoin, during their hospitalization. These findings are 
consistent with a report of elevated UGT1A1 mRNA in the liver of a
phenytoin-exposed patient. They also agree with clinical evidence 
supporting the effectiveness of phenobarbital in inducing bilirubin 
elimination in patients with unconjugated hyperbilirubinemia. 

Further evidence in support of environmental influences on 
UGT1A1 was provided by studies using primary human hepatocytes. 
After adaptation to culture, a common environment, UGT1A1 mRNA
levels showed lower variability (<3- versus 50-fold). The three cul- 
tures from the phenytoin-exposed donors showed the most pro- 
nounced declines in UGT1A1 mRNA, likely due to the removal of 
inducing stimuli (phenytoin, diet). Induction studies showed that 
phenobarbital and oltipraz (prototypical inducing agents) resulted in 
elevated UGT1A1 mRNA, but exposure to 3-methylcholanthrene 
resulted in the most potent inducing effects (3- to 6-fold). These 
findings suggest that expression of UGT1A1 in human liver is under
the control of multiple regulatory mechanisms, including the aryl
hydrocarbon receptor. Exposure to polycyclic hydrocarbon-type inducers
via cigarette smoking is another potential cause of UGT1A1 variation.

The possible role of the UGT1A1 promoter polymorphism (Bosma 
et al., 1995) in the observed variation was also investigated by 
genotyping the donors for the number of TA repeats in their UGT1A1 
TATA boxes. A genetic influence on expression was supported by the 
observation that two individuals with the lowest UGT1A1 expression 
were homozygotes for the (TA)7TAA allele (Ritter et al., 1999). A
third (TA)7TAA homozygote was one of the phenytoin-exposed
patients. This patient exhibited lower UGT1A1 levels than did the two
other phenytoin-exposed patients, who were both homozygous for the
wild-type allele [(TA)6TAA]. These data provide support for both
genetic and environmental factors in interindividual variation in he- 
patic UGT1A1 expression. 
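
The genotyping step described above amounts to counting the TA repeats in the UGT1A1 TATA box. A minimal Python sketch of that counting step is given below. It assumes the promoter region is available as a plain nucleotide string; the function names and the flanking bases in the usage lines are hypothetical, and the allele labels follow the (TA)6TAA, (TA)7TAA, and (TA)8TAA designations used in this report.

```python
import re

# Allele labels keyed by TA repeat number, as named in this report.
ALLELE_BY_REPEATS = {6: "wild type", 7: "UGT1A1*28", 8: "UGT1A1*34"}


def count_ta_repeats(promoter_seq):
    """Return the largest n for which (TA)nTAA occurs in the sequence."""
    seq = promoter_seq.upper()
    # Each match is a run of TA dinucleotides immediately followed by TAA.
    counts = [len(m.group(1)) // 2 for m in re.finditer(r"((?:TA)+)TAA", seq)]
    return max(counts, default=0)


def classify_allele(promoter_seq):
    """Map the TA repeat count to an allele label, if one is known."""
    n = count_ta_repeats(promoter_seq)
    return ALLELE_BY_REPEATS.get(n, f"unclassified ({n} TA repeats)")


# Hypothetical flanking bases around the repeat, for illustration only.
print(classify_allele("GCCA" + "TA" * 6 + "TAAGTAGG"))
print(classify_allele("GCCA" + "TA" * 7 + "TAAGTAGG"))
```

A diploid genotype would be obtained by applying the same function to each of the two alleles of a donor.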

The Gunn rat provides a useful animal model to investigate the 
relative contribution of UGT1 isoforms in total glucuronidating ac- 
tivities. The frame-shift mutation associated with the loss of UGT1A1 
activity and hyperbilirubinemia in Gunn rats also inactivates the other 
UGT1 family isoforms. The contributions of UGT1 isozymes were 
assessed using various in vitro (microsomal UGT assays) and in vivo 
approaches (pharmacokinetics and organ toxicity). Two examples are 
the analgesic and potential hepatotoxicant, acetaminophen (de Morais 
et al., 1992a), and the toxic environmental pollutant, benzo[a]pyrene
(B[a]P) (Hu and Wells, 1992). Establishing the identities of UGT1 
isoforms involved in the glucuronidation of specific substrates re- 
quires the use of cloned and expressed UGT cDNAs. Using human 
embryonic kidney cells expressing the major UGT1 isoforms found in 
rat liver [UGT1A1, UGT1A5, UGT1A6, and UGT1A7 (Ikushiro et 
al., 1995)], we investigated the selectivities of these isoforms in the 
glucuronidation of bilirubin, acetaminophen, and B[a]P metabolites. 
Only the UGT1A1 isozyme was active toward bilirubin. In contrast, 
two of the four rat liver isoforms tested were active toward acetamin- 
ophen (UGT1A6 and UGT1A7) (Kessler et al., 2002). In rats main- 
tained on a standard laboratory diet, UGT1A6 and UGT1A7 are 
expressed at low levels in liver but are induced after exposure to 
certain inducers (Grove et al., 1997; Kessler and Ritter, 1997; Koba- 
yashi et al., 1998). These data likely indicate an important role for 
UGT1A6 and UGT1A7 in protection against acetaminophen-induced
hepatotoxicity. The activities of these isoforms resemble the reported
activities of human UGT1A6 and UGT1A9 toward acetaminophen 
(Bock et al., 1993). Apparent differences in the affinity and/or capac- 
ity of UGT1A6 (high affinity, low capacity) and UGT1A9 (low 
affinity, high capacity) for acetaminophen likely indicate that they 
will contribute differently to protection against acetaminophen in 
overdose situations. However, the possibility that other UGT1 iso- 
forms besides the phenol UGTs contribute to acetaminophen glucu- 
ronidation is suggested by the finding that human UGT1A1 is also 
significantly active (Court et al., 2001). These results highlight the 
species-specific nature of drug glucuronidation. 

Glucuronidation also is known to modulate toxicities associated 
with B[a]P exposure (Hu and Wells, 1992; Hu and Wells, 1994). The
high activity of the rat UGT1A7 form toward B[a]P metabolites, 
including phenols, quinols, and dihydrodiols, has been demonstrated 
(Grove et al., 1997). Despite its inducibility by polycyclic hydrocar- 
bons, UGT1A6 shows very low activity toward most B[a]P metabo- 
lites, a finding consistent with the preference of UGT1A6 for small 
phenolic compounds with simple ring substitutions. Interestingly, rat 
UGT1A1 was found to be significantly active. Qualitatively, 
UGT1A1 resembles the UGT1A7 form in its activity toward many 
different B[a]P metabolites including B[a]P-7,8-dihydrodiol. These 
findings are supported by the observation that the human bilirubin 
UGT is active toward B[a]P-7,8-dihydrodiol (Fang et al., 2002).
Although the specific activities of UGT1A1 were generally 2- to 6-fold
lower than those of UGT1A7, some activities were higher (e.g., toward
B[a]P-4,5-dihydrodiol). The high natural abundance of this form in
liver, together with the observation of UGT1A1 inducibility by
polycyclic aromatic hydrocarbons, supports a role for UGT1A1 deficiency
in the mechanism of increased sensitivity of Gunn rats to B[a]P- 
induced toxic effects. Variation in UGT1A1 has the potential to alter 
individual sensitivities to polycyclic aromatic hydrocarbons present in 
the diet and the environment. 

Toxicological Relevance of UGTs (P.G.W., P.M.K.) 

Toxicologic Implications of UGT Deficiencies. We have focused 
upon drugs and environmental chemicals for which toxicity depends 
upon the bioactivation of the xenobiotic or a stable metabolite by 
enzymes like the cytochromes P450 (P450s) or prostaglandin H 
synthases to highly toxic electrophilic and/or free radical reactive 
intermediates that damage cellular macromolecules (DNA, protein, 
lipid) and/or enhance oxidative stress (for reviews, see Wells and 
Winn, 1996; Wells et al., 1997). In such cases, glucuronidation and 
elimination of the xenobiotic and/or its stable metabolite serves as a 
toxicological gatekeeper, directing metabolism away from toxifying 
bioactivation. Furthermore, bioactivation often is a quantitatively mi- 
nor pathway, often amounting to less than 10 to 15% of overall 
elimination, whereas glucuronidation usually is quantitatively major, 
constituting 40 to 75% or more of xenobiotic elimination. Hence, 
relatively minor deficiencies in glucuronidation theoretically can
result in a substantial percentage increase in bioactivation, with
toxicological consequences even at therapeutic drug concentrations or
putatively safe concentrations of environmental chemicals.
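
The arithmetic behind this argument can be sketched in Python. The example below assumes a simple model that is not stated in the text: elimination proceeds through parallel first-order pathways, so the fraction of a dose passing through each pathway equals that pathway's clearance divided by total clearance. The clearance values (glucuronidation 60, bioactivation 10, other routes 30, in arbitrary units) are illustrative numbers chosen to fall within the ranges quoted above, and the function names are hypothetical.

```python
def bioactivation_fraction(cl_gluc, cl_bioact, cl_other):
    """Fraction of the dose that is bioactivated under parallel pathways."""
    return cl_bioact / (cl_gluc + cl_bioact + cl_other)


def relative_increase(cl_gluc, cl_bioact, cl_other, gluc_deficit):
    """Percent rise in bioactivation when glucuronidation clearance drops.

    gluc_deficit is the fractional loss of glucuronidation clearance,
    e.g. 0.5 for a 50% deficiency.
    """
    base = bioactivation_fraction(cl_gluc, cl_bioact, cl_other)
    reduced = bioactivation_fraction(
        cl_gluc * (1.0 - gluc_deficit), cl_bioact, cl_other
    )
    return 100.0 * (reduced - base) / base


# Illustrative clearances: glucuronidation 60, bioactivation 10, other 30.
for deficit in (0.25, 0.5, 1.0):
    rise = relative_increase(60.0, 10.0, 30.0, deficit)
    print(f"{deficit:.0%} glucuronidation deficit -> {rise:.1f}% more bioactivation")
```

Under these assumptions the relative increase grows faster than the deficit itself, and it would be larger still if the remaining routes were saturable, which this sketch does not model.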

Hepatic and Renal Toxicity. The widely used analgesic drug 
acetaminophen (paracetamol) at high doses is hepatotoxic and neph- 
rotoxic, and eliminated primarily via UGT-catalyzed glucuronidation. 
Mutant Gunn and RHA rats, which lack all UGT1A enzymes (re- 
viewed by Iyanagi et al., 1998), had reduced acetaminophen glucu- 
ronidation, enhanced P450-catalyzed bioactivation of acetaminophen 
and covalent binding to hepatocellular proteins, and enhanced
hepatocellular centrilobular and renal cellular necrosis (de Morais and
Wells, 1988, 1989; de Morais et al., 1992a). Interestingly, the
homozygous UGT1A-deficient mutant rats exhibited measurable, albeit
substantially reduced, acetaminophen glucuronidation, suggesting the 
likely contribution of a UGT2 isozyme. In people with a hereditary 
UGT1A1 deficiency (Gilbert's syndrome) (review: Tukey and Strass- 
burg, 2000), similar studies using intravenous drug administration 
showed reduced glucuronidation of acetaminophen and, conversely, 
enhanced bioactivation determined by the formation of glutathione- 
derived acetaminophen metabolites, although measurable hepatotox- 
icity was not evident at the therapeutic dose employed (de Morais et 
al., 1992b). Among all the Gilbert's subjects and controls, a greater 
reduction in glucuronidation correlated highly with a greater enhance- 
ment in bioactivation. Conflicting data have been reported in other 
studies, with reduced acetaminophen glucuronidation apparent in 
some Gilbert's subjects and no effect in others (Ullrich et al., 1987; 
Esteban and Perez-Mateo, 1999). In our study, one subject with 
Gilbert's syndrome had normal acetaminophen glucuronidation, 
whereas one of the "normal" subjects without Gilbert's syndrome had 
deficient acetaminophen glucuronidation and enhanced bioactivation. 
These findings suggest that acetaminophen glucuronidation is more 
complex than previously thought, and may be explained by multiple 
UGT isozymes contributing to its glucuronidation, including 
UGT1A1 and 1A6. The extent to which each isozyme contributes may
differ significantly among individuals, depending on exposure to 
inducing agents, and genetic and other factors. Interestingly, a recent 
study found that 8% of people with Gilbert's syndrome also possessed 
a homozygous deficiency in the UGT1A6 gene (Lampe et al., 1999). 

Carcinogenesis. The environmental teratogens and/or carcinogens 
benzo[a]pyrene and 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone
(NNK) are representative of a multitude of polycyclic aromatic hy- 
drocarbons and nitrosamines found widely in the environment, par- 
ticularly in tobacco smoke. Most benzo[a]pyrene and NNK metabo- 
lites are eliminated primarily via glucuronidation, catalyzed by several 
UGT1A and UGT2 isozymes (Grove et al., 2000; Tukey and Strass- 
burg, 2000), which avoids the alternative P450- and/or prostaglandin 
H synthase-catalyzed bioactivation of the metabolites to toxic reactive 
intermediates that irreversibly damage DNA (for reviews, see Wells 
and Winn, 1996; Wells et al., 1997). Using UGT1A-deficient Gunn
and RHA rats, the glucuronidation of benzo[a]pyrene metabolites was 
found to be reduced in vitro and in vivo, resulting in their enhanced 
bioactivation and covalent binding to both protein and DNA (Hu and 
Wells, 1992). UGT deficiency was the critical toxicologic determi- 
nant, since these animals compared with controls had similar activities 
of various P450 isozymes and glutathione S-transferase isozymes and,
in the case of the RHA strain, the controls were congenic (Hu and 
Wells, 1992). To estimate carcinogenic risk in UGT-deficient rats, a 
skin fibroblast model was developed, with genotoxicity and potential 
carcinogenicity assessed by the formation of micronuclei (Vienneau et 
al., 1995; Kim and Wells, 1996a). Cultured fibroblasts from UGT- 
deficient Gunn and RHA rats incubated with either benzo[a]pyrene or 
NNK had increased oxidative DNA damage and micronucleus forma- 
tion compared with UGT-normal controls, whereas no increase in 
micronuclei occurred with benzo[e]pyrene, a noncarcinogenic isomer.
Benzo[a]pyrene- and NNK-initiated micronucleus formation was de- 
pendent upon P450- and/or peroxidase-catalyzed bioactivation (Kim 
and Wells, 1996a; Kim et al., 1997a). 

Cellular Models for Human Risk Assessment. To facilitate hu- 
man studies, we have evaluated lymphocytes, which can be obtained 
in relatively large quantities sufficient for determination of collective 
UGT activity. To determine whether lymphocytes accurately reflect 
hepatic activities, lymphocytes and hepatic microsomes were taken/ 
prepared from the same RHA UGT-normal (+/+) and UGT-deficient 
(j/j) rats (Hu and Wells, 1994). To produce benzo[a]pyrene
metabolites as substrates for glucuronidation or ultimate bioactivation,
benzo[a]pyrene was preincubated with rat liver microsomes and NADPH,
and the supernatant was immediately added to the lymphocyte
incubations. Lymphocytes from UGT-deficient rats accurately reflected
the decreased glucuronidation of benzo[a]pyrene metabolites and 
enhanced bioactivation, covalent binding, and cytotoxicity that were 
observed with hepatic microsomes from the same rats (Hu and Wells, 
1994), and with related in vivo studies of benzo[a]pyrene glucu- 
ronidation, bioactivation, and covalent binding (Hu and Wells, 1992), 
and embryotoxicity (Wells et al., 1989). Lymphocytes may constitute 
a useful model for risk assessment. 

In preliminary human studies, lymphocytes from 12 normal volun- 
teers were tested as described in the rat lymphocyte studies above (Hu 
and Wells, 1993). All subjects had normal UGT activity for bilirubin 
(i.e., none had Gilbert's syndrome), but there was a 200-fold variabil- 
ity in UGT activities for benzo[a]pyrene metabolites, including two 
with no measurable activity. Decreasing UGT activity correlated with 
decreased UGT-dependent protection against benzo[a]pyrene covalent
binding and, conversely, with increased cytotoxicity for
benzo[a]pyrene quinones and diols, but not monophenols. A similar
protection against benzo[a]pyrene quinones, but not
3-OH-benzo[a]pyrene or benzo[a]pyrene-7,8-dihydrodiol, was shown in
human lymphoblastoid cells transfected with rat UGT1A7 (Grove et
al., 2000). Our lymphocyte studies suggest that substantial UGT
deficiencies for potentially toxic benzo[a]pyrene metabolites are com- 
mon in the normal population and indicate that these UGT deficien- 
cies may constitute important determinants of toxicologic predisposi- 
tion, particularly with respect to chemical carcinogenesis and 
teratogenesis. 

Developmental Toxicity. Whereas there is little UGT activity in 
the rodent embryo, maternal UGT activity theoretically may play an 
important role in determining how much teratogen reaches the em- 
bryo. In preliminary studies, pregnant UGT-deficient Gunn rats were 
substantially more susceptible to benzo[a]pyrene-initiated embryonic 
death at a subcarcinogenic dose (25 mg/kg i.p.) that had no effect on 
UGT-normal Wistar controls (Wells et al., 1989). Similarly, the 
anticonvulsant drug phenytoin, a human teratogen, and its major 
metabolite, 5-(p-hydroxyphenyl)-5-phenylhydantoin, caused
increased DNA oxidation and micronucleus formation in UGT-deficient
cultured rat skin fibroblasts, and in vivo studies indicated that this
enhanced genotoxicity was due to decreased N-glucuronidation of
phenytoin and O-glucuronidation of
5-(p-hydroxyphenyl)-5-phenylhydantoin in the UGT-deficient rats (Kim et al., 1997b), resulting in
increased hydroxyl radical formation (Kim and Wells, 1996b).
Preliminary in vivo evidence suggests a similar enhancement in
embryopathies in pregnant Gunn and RHA UGT-deficient rats treated
with phenytoin (Kim and Wells, 1998). In the last trimester of
pregnancy, increasing activities of some fetal UGTs may provide the
offspring with a second layer of biochemical protection,
particularly against functional (as distinct from structural) anomalies,
although this has yet to be studied.

General Toxicologic Observations. Overall, several consistent 
observations emerged from the above studies. First, a relatively small 
percentage decrease in a quantitatively major pathway of elimination, 
such as glucuronidation, can produce a disproportionately large per- 
centage increase in bioactivation. This effect is particularly evident if 
there are no alternative eliminating pathways, or if the alternative 
pathways are readily saturable, as is the case with sulfation. For 
example, even in the rat, which has at least twice the sulfating 
capacity of mice and humans for acetaminophen, there was no
compensatory enhancement of acetaminophen sulfation in UGT-deficient
rats (de Morais et al., 1992a). The toxicological consequences in
humans may be more pronounced. 

Second, consistent, progressively greater deficiencies in UGT ac- 
tivity in +/j and j/j RHA rats resulted in corresponding UGT gene 
dose-dependent decreases in the glucuronidation of both acetamino- 
phen and benzo[a]pyrene in vitro and/or in vivo and, conversely, 
increasing xenobiotic bioactivation, covalent binding, and toxicity, all 
of which were reflected in the lymphocyte model. A consistent finding 
of particular clinical interest is that a heterozygous deficiency in 
UGTs was toxicologically relevant (de Morais et al., 1992a; Hu and 
Wells, 1992, 1994), with a risk equivalent to homozygotes in some 
systems (Kim and Wells, 1996b; Kim et al., 1997b). 

Finally, despite our improving understanding of the pharmacog- 
enomic basis for UGT deficiencies and the specific roles of UGT 
isozymes in xenobiotic glucuronidation (reviews: Tukey and
Strassburg, 2000; Guillemette, 2003), the complexity of toxicological risk
remains challenging. At a given xenobiotic concentration, individual 
toxicological susceptibility will depend upon both the levels of the 
relevant UGT proteins, which vary according to genetic and environ- 
mental determinants, and the overall balance among numerous asso- 
ciated biochemical pathways, including elimination via other conju- 
gating pathways, membrane transport, bioactivation, reactive 
intermediate detoxification, and macromolecular repair, among oth- 
ers. Environmental modulation of the in vivo outcome can be partic- 
ularly unpredictable, as exemplified by acetaminophen toxicity, which 
in rodents is reduced by the UGT inducer oltipraz due to enhanced 
glucuronidation (Davies and Schnell, 1991; Kessler et al., 2002), but 
conversely is enhanced by pretreatment with the UGT inducers phe- 
nobarbital and 3-methylcholanthrene, presumably due to their rela- 
tively greater induction of P450-catalyzed bioactivation. 

Pharmacogenetics of UGT Enzymes: Implications for Cancer 
Susceptibility (C.G.)

The molecular genetics of the UGT superfamily are well under- 
stood (Mackenzie et al., 1997; Gong et al., 2001), but the molecular 
mechanisms of large interindividual phenotypic variations remain to 
be elucidated. Breakthroughs in the identification of functional com- 
mon genetic variations in UGT genes that may impact drug response 
and/or disease susceptibility are starting to emerge. Polymorphic 
variations that affect functional activity of the enzyme have been 
characterized in human UGT1A and UGT2B genes. Similar to other 
drug-metabolizing enzymes, evidence of differential prevalence 
among ethnic and racial groups has been observed in human
populations (Fig. 2) (reviewed in Guillemette, 2003).

The first evidence suggesting alterations in UGT genes as a genetic 
risk factor of cancer was recently obtained. In a first study, we 
hypothesized that constitutive alteration in UGTs involved in the 
inactivation of estradiol and its catechol-reactive metabolites may 
modify estrogen exposure and, consequently, estrogen-related cancer 
risk. We investigated UGT1A1 as a first candidate gene. UGT1A1 is 
a major UGT expressed in mammary gland and involved in the 
formation of estradiol-glucuronide (Senafi et al., 1994; Guillemette et 
al., 2000a). The most common genetic variant in the UGT1A1 gene is 
a dinucleotide repeat polymorphism in the atypical TATA-box region 
of the UGT1A1 promoter (Bosma et al., 1995). Using genetic epide- 
miological studies designed as population-based case-control studies, 
we first observed that the UGT1A1*28 (A(TA)7TAA) and the 
UGT1A1*34 (A(TA)8TAA) promoter alleles were associated with an 
increased risk of developing invasive breast cancer in premenopausal 
African-American women (OR = 2.1; 95% confidence interval, 1.0- 
4.2; p = 0.04) (Guillemette et al., 2000a). This finding is consistent 
with a role for estrogen-UGT in modulating the action of endogenous 
hormones in breast cancer risk in the African-American population. 
However, in a nested case-control study of white women within the 
Nurses' Health Study cohort, we were unable to detect a significant 
risk associated with the low transcriptional allele UGT1A1*28 (Guillemette 
et al., 2001). In the same population, an elevation of estradiol 
among women who are carriers of at least one UGT1A1*28 allele was 
observed and suggests a possible contribution of the glucuronidation 
pathway, and especially UGT1A1, in the maintenance of hormone 
homeostasis, although not sufficient to alter breast cancer risk in white 
women. We further studied the relationship between UGT1A1 poly- 
morphisms and variation in breast density, a predictor of breast 
cancer. Premenopausal women homozygous for the UGT1A1*28 
allele were found to have significantly lower breast density compared 
with those with the *1/*1 genotype (-43.1% difference; p = 0.04). In 
contrast, postmenopausal women with the UGT1A1*28/*28 genotype 
had greater breast density compared with those with the *1/*1 genotype 
(+32.0% difference; p = 0.05), which was even greater among 
current postmenopausal hormone users (+56.8% difference; p = 
0.03). These results suggest that UGT1A1 genotype is a predictor of 
breast density within groups of different menopausal status and sup- 
port that interindividual differences in estrogen glucuronidation influ- 
ence local estrogen concentration and breast density (Haiman et al., 
2003). 
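
The odds ratios and 95% confidence intervals quoted above are the standard summary of a case-control comparison. As a minimal sketch of the arithmetic, the following Python function computes an unadjusted odds ratio with a Woolf (log-based) confidence interval from a 2 x 2 table. The function name and the counts used in the example are hypothetical and are not taken from the cited studies, whose published estimates may have been adjusted by regression modeling.

```python
import math


def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Unadjusted odds ratio with a Woolf (log-scale) confidence interval.

    "Exposed" here would mean carrying the variant allele of interest.
    z = 1.96 corresponds to a two-sided 95% interval.
    """
    cells = (exposed_cases, unexposed_cases,
             exposed_controls, unexposed_controls)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive for the Woolf method")

    # Cross-product ratio of the 2 x 2 table.
    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls)

    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(sum(1.0 / n for n in cells))

    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only.
or_value, ci_low, ci_high = odds_ratio_ci(20, 10, 10, 20)
print(f"OR = {or_value:.2f}; 95% CI {ci_low:.2f}-{ci_high:.2f}")
```

An interval whose lower bound sits at or just below 1.0, such as several of those reported in this section, indicates an association of borderline statistical significance.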

Recent studies support the idea that UGTs play a significant role in 
the detoxification of environmental carcinogens and therefore repre- 
sent good candidates for low-penetrance susceptibility genes that 
possibly contribute to cancer risk by amplifying the effects of carcin- 
ogen exposure. Among these, UGT1A7 is an important extrahepatic 
UGT and represents one of several UGTs that were shown to be active 
on chemical carcinogens. In a recent study, we identified three com- 
mon UGT1A7 allelic variant isozymes that encode novel UGT1A7 
proteins differing in primary structure at three amino acid positions 
(Fig. 2) (Guillemette et al., 2000b; Woolley et al., 2000). Prevalence 
studies confirmed the presence of all four UGT1A7 alleles in the 
African-American and Asian populations, although at various frequencies. 
Functional studies revealed significant differences in catalytic 
activity of the UGT1A7 alleles toward benzo(a)pyrene metabolites, 
known tobacco carcinogens, as well as for a number of 
additional substrates (Guillemette et al., 2000b). UGT1A7 was shown 
to be expressed in normal orolaryngeal tissue specimens including 
tongue, tonsil, floor of mouth, and larynx, which are target tissues of 
carcinogenic environmental molecules (Zheng et al., 2001). The 
UGT1A7 allele, associated with low activity, was subsequently asso- 
ciated with an increased risk of orolaryngeal cancer (Zheng et al., 
2001). Genotypes containing different combinations of the two lowest 
activity alleles, UGT1A7*3 (Lys129 Lys131 Arg208) and UGT1A7*4 
(Asn129 Arg131 Arg208), were strongly linked to an increased risk for 
orolaryngeal cancer in white (OR = 2.8; 95% CI 1.1-7.6) and in 
African-American people (OR = 6.2; 95% CI 1.2-31) compared with 
the wild-type genotype (homozygous for the UGT1A7*1 allele 
[Asn129 Arg131 Trp208]). Upon stratification by cancer site, predicted 
low-activity UGT1A7 genotypes were strongly linked to increased 
risk for both oral cavity (OR = 4.2; 95% CI 1.7-0) and laryngeal 
cancers (OR = 3.7; 95% CI 0.99-14). In addition, subjects with 
low-activity genotypes who were light or heavy smokers had a significantly 
increased risk compared with the wild-type genotype 
(OR = 3.7; 95% CI 1.1-12 and OR = 6.1; 95% CI 1.5-25, respectively) 
(Zheng et al., 2001). These results revealed for the first time 
that genetic variations in the UGT1A7 gene that reduce the carcino- 
gen-detoxifying activity increase the risk of developing a smoking- 
related orolaryngeal cancer. The association of UGT1A7 alleles with 
risk of hepatic and colorectal cancers was further demonstrated (Vogel 
et al., 2001; Zheng et al., 2001; Strassburg et al., 2002). Recently, 
we conducted a case-control study, with 400 cases and 400 controls 
matched for age, sex, and race, to assess the relation between 
characteristics of meat consumption, heterocyclic amine (HCA) exposure, 
the UGT1A7 genotype, and colon cancer. No main effect of the 
UGT1A7 genotype was observed on colon cancer risk. On the other 
hand, the association between dietary HCA exposure and colon cancer 
was modified in individuals with the low-activity UGT1A7 genotypes 
(C. Guillemette, unpublished observations). These data suggest that 
the relation between dietary sources of HCA and colon cancer may be 
modulated by the UGT1A7 detoxification pathway. These results also 
point to HCA exposure as an important etiologic factor in colon 
cancer. Altogether, these findings warrant additional epidemiological 
studies to confirm the role of low UGT1A7 conjugator genotypes in 
risk for cancers. In addition, our group recently discovered two additional 
UGT1A7 single nucleotide polymorphisms, found exclusively in 
African-American subjects, which generate five additional alleles 
(UGT1A7*5 to *9) when combined with the four known single 
nucleotide polymorphisms present in UGT1A7*2, *3, and *4. Upon 
functional analysis, several of these UGT1A7 variant isozymes 
exhibited much lower glucuronidation activities compared with 
UGT1A7*1; their role in cancer remains unverified at the present time 
(Girard et al., 2003). 

Fig. 2. Common genetic polymorphisms in UGT1A and UGT2B genes. 

A, common genetic polymorphisms at the UGT1A gene locus. The entire UGT1 family is derived from a single gene locus (UGT1), located on chromosome 2q37, which encodes nine 
functional proteins (UGT1A1, and UGT1A3 to UGT1A10) and four pseudogenes (UGT1A2p, and UGT1A11p to UGT1A13p) (Ritter et al., 1992; Gong et al., 2001). B, common genetic 
polymorphisms in functional UGT2B genes. The UGT2B family comprises several distinct genes and pseudogenes, which are not included here. The genomic organization of functional 
UGT2B genes is not entirely elucidated; therefore, the relative chromosomal localization used here was simplified for schematic purposes. Sequences that differ by less than 3% are considered 
alleles of the same gene (Mackenzie et al., 1997). To date, allelic variants have been reported for UGT1A1, UGT1A6, UGT1A7, UGT1A8, and UGT1A9, in addition to UGT2B4, UGT2B7, 
UGT2B15, and UGT2B28. The function and prevalence of some of these variants have also been described (Guillemette et al., 2000, 2001; Woolley et al., 2000; Girard et al., 2003; Beutler 
et al., 1998; Bosma et al., 1994; Ciotti et al., 1997; Coffman et al., 1998; Lampe et al., 1999, 2000; Guillemette et al., 2000; Hall et al., 1999; Riedy et al., 2000; Levesque et al., 1997, 1999; 
Turgeon et al., 2000; Akaba et al., 1998; Maruo et al., 1999; Huang et al., 2002; Villeneuve et al., 2003). Recent studies support a possible role of UGT in cancer risk (Guillemette et al., 
2000, 2001; Zheng et al., 2001; Vogel et al., 2001; Strassburg et al., 2002; Gsur et al., 2002; MacLeod et al., 2000; Gestl et al., 2002), although additional studies are needed to confirm these 
findings. a The structure of the UGT1 gene presented here is based on the GenBank accession number AF297093 (Gong et al., 2001). b UGT1A5 alleles correspond to: UGT1A5*1 
Ala 158 His 22 W 48 Val 249 Gly 259 ; UGT1A5*2 Gly'^His^Leu^Val^Gly 239 ; UGT1A5*3 Ala'^yi^^Leu^Val^Gly 259 ; UGT1A5*4 Ala'^His^Ile^Leu^Are 239 ; 
UGT1A5*5 Gly'^His^Ile^^u^Arg 259 ; UGT1A5*6 Ala'^yr^Ue^W^Arg 259 ; UGT1A5*7 Gly'^yi^lle^W^Arg 2 . c UGT1A7 alleles correspond to: 
UGT1A7*1 Gly115 Asn129 Arg131 Glu139 Trp208; UGT1A7*2 Gly115 Lys129 Lys131 Glu139 Trp208; UGT1A7*3 Gly115 Lys129 Lys131 Glu139 Arg208; UGT1A7*4 
Gly115 Asn129 Arg131 Glu139 Arg208; UGT1A7*5 Ser115 Asn129 Arg131 Glu139 Arg208; UGT1A7*6 Gly115 Asn129 Arg131 Glu139 Trp208; UGT1A7*7 Gly115 Lys129 Lys131 Asp139 Trp208; 
UGT1A7*8 Gly115 Lys129 Lys131 Asp139 Arg208; and UGT1A7*9 Ser'^Lys'^ys^'Glu'^r 208 . d UGT1A9 alleles correspond to: UGT1A9*1 Cys3 Met33; UGT1A9*2 Tyr3 Met33; 
UGT1A9*3 Cys3 Thr33 (Villeneuve et al., 2003); UGT1A9*4 Tyr^X (Y. Saito, unpublished data); and UGT1A9*5 Asp246Asn (Jinno et al., 2003a). e UGT1A10 alleles correspond 
to: UGT1A10*1 Met'Thr 202 ; UGT1A10*2 Ue 39 !^ 02 ; and UGT1A10*3 Met^Ile 202 (Jinno et al., 2003b). f The relative positions of the UGT2B4, UGT2B7, and UGT2B15 genes 
on chromosome 4q13 are based on the data reported by Riedy et al. (2000). g Polymorphic expression of two truncated UGT2B28 variants (type II and type III) has been reported 
(Levesque et al., 2001). UGT2B28 type II differs from type I by a deletion of 308 bp in the cofactor binding domain, whereas UGT2B28 type III lacks 351 bp in the putative substrate 
binding domain. h An additional cDNA clone isolated from human liver corresponds to the UGT2B4 Phe109Leu, Phe396Leu allele but appears to be very rare since it was not found 
in two independent population studies. 
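
The relation between the common UGT1A7 alleles and the residues that define them can be made concrete with a small lookup. The sketch below encodes only the four alleles UGT1A7*1 to *4 by their residues at positions 129, 131, and 208, as listed in the Fig. 2 legend, and treats *3 and *4 as the low-activity alleles named in the text. It assumes the wild-type residues at positions 115 and 139, so the rarer *5 to *9 alleles are deliberately left out. The function names, and the idea of simply counting low-activity alleles per genotype, are illustrative assumptions and not the classification scheme used in the cited studies.

```python
# Residues at positions 129, 131 and 208 for the four common alleles,
# as listed in the Fig. 2 legend (Gly115 and Glu139 are assumed).
UGT1A7_ALLELES = {
    ("Asn", "Arg", "Trp"): "UGT1A7*1",
    ("Lys", "Lys", "Trp"): "UGT1A7*2",
    ("Lys", "Lys", "Arg"): "UGT1A7*3",
    ("Asn", "Arg", "Arg"): "UGT1A7*4",
}

# The text describes *3 and *4 as the two lowest-activity alleles.
LOW_ACTIVITY_ALLELES = {"UGT1A7*3", "UGT1A7*4"}


def call_allele(res129, res131, res208):
    """Return the star-allele name for one haplotype, or 'unassigned'."""
    return UGT1A7_ALLELES.get((res129, res131, res208), "unassigned")


def count_low_activity(genotype):
    """Count low-activity alleles in a genotype given as two allele names."""
    return sum(1 for allele in genotype if allele in LOW_ACTIVITY_ALLELES)


# Illustrative genotype: one *3 haplotype and one wild-type haplotype.
genotype = (call_allele("Lys", "Lys", "Arg"), call_allele("Asn", "Arg", "Trp"))
print(genotype, count_low_activity(genotype))
```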

In conclusion, much research in this area is needed, although 
promising leads have emerged regarding the role of UGT genetic 
polymorphisms in cancer etiology and on their possible implication in 
modulating the degree of exposure to several carcinogenic com- 
pounds. 

Gene Therapy for UGT Deficiencies (J.R.C., N.R.C.) 

Genetic lesions of UGT1A1 can result in three grades of hyperbilirubinemia 
in humans. Insertions, deletions, or mutations of any of 
the five exons encoding UGT1A1 that cause near-complete loss of the 
enzyme activity result in the potentially lethal disorder, Crigler-Najjar 
syndrome type 1 (CN-1). Mutations causing lesser degrees of reduction 
of UGT1A1 activity cause a milder form of the disease, termed 
Crigler-Najjar syndrome type 2. The third grade of hyperbilirubinemia 
is an even milder form, termed Gilbert's syndrome, which results 
from the insertion of TA dinucleotides within the TATAA element of 
the UGT1A1 promoter or base substitutions in the UGT1A1 coding 
region (Roy Chowdhury et al., 2001). Of these three disorders, CN-1 
is associated with levels of unconjugated hyperbilirubinemia that are 
severe enough to cause bilirubin encephalopathy. Before the routine 
use of phototherapy, CN-1 was generally lethal during infancy. Al- 
though phototherapy permits survival beyond adolescence, it becomes 
progressively less effective around puberty, so that the risk of ker- 
nicterus persists life-long. Currently, liver transplantation is the only 
definitive therapy for CN-1. However, discovery of the molecular 
bases of inherited jaundice, and advances in the techniques of hepa- 
tocyte transplantation and nucleic acid transfer to the liver have 
brought gene therapy for Crigler-Najjar syndrome close to reality. 
Gene therapy methods can be classified into approaches based on 
isolated hepatocytes and methods using gene delivery in vivo. 

Hepatocyte-Based Gene Therapy. In the simplest form, normal 
genes may be introduced into the liver of patients by transplantation 
of allogeneic normal human hepatocytes. Hepatocytes introduced into 
the portal circulation by infusion into the portal vein or injection into 
the splenic pulp integrate into normal liver cords of structurally 
normal liver with remarkable rapidity and function on a long-term 
basis (Gupta et al., 1991). Immunosuppression is required for preven- 
tion of allograft rejection. This method should be particularly useful 
for diseases, such as CN-1, in which the liver architecture is normal. 
Because the host liver remains intact, the metabolic cost of graft 
rejection is limited. A 10-year-old girl with CN-1 was the first to 
receive hepatocyte transplantation (Fox et al., 1998). Introduction of 7.5 × 
10^9 normal isolated hepatocytes into the liver through a percutaneously 
placed portal vein catheter reduced serum bilirubin to approximately 
half the pretransplant level for over 2 years. This study 
demonstrated the safety and efficacy of hepatocyte transplantation, 
but despite the replacement of approximately 5% of the hepatic 
UGT1A1 activity, the need for phototherapy was not obviated (Roy 
Chowdhury et al., 1998). It appears that complete cure of metabolic 
liver diseases will require repeated hepatocyte transplantation or pref- 
erential proliferation of the transplanted hepatocytes over the host 
cells. 

To solve many of the lingering problems associated with hepato- 
cyte transplantation, extensive studies are ongoing in Gunn rats that 
are both a molecular and pathophysiological model of CN-1. Such 
preferential proliferation requires a strong mitotic stimulus to the 
liver, to which only the engrafted cells, but not the host hepatocytes, 
can respond. This was achieved by the combination of preparative 
irradiation of the liver and partial hepatectomy, prior to hepatocyte 
transplantation (Guha et al., 1999, 2002). Controlled irradiation of the 
liver prevents proliferation of the host Gunn rat hepatocytes. Consequently, 
transplanted congenic normal hepatocytes progressively repopulate 
the liver almost completely by 12 weeks, fully normalizing 
serum bilirubin levels. 

Another approach involves isolating hepatocytes from a resected 
liver segment, transducing the primary hepatocytes with a therapeutic 
gene in culture and subsequently transplanting the cells back into the 
donor (Roy Chowdhury et al., 1991). Because the cells are autolo- 
gous, immune suppression is not needed. However, the number of 
cells that can be harvested, transduced, and engrafted after transplan- 
tation is limited, which severely restricts the efficiency of ex vivo 
gene therapy. Some of these limitations may be overcome by condi- 
tionally immortalizing the hepatocytes, so that the transduced cells 
can be expanded in culture before transplantation (Tada et al., 1998). 

Gene Transfer in Vivo. The conventional in vivo gene therapy 
methods consist of replacing a missing functional gene by transferring 
nucleic acids to the target cells using viral or nonviral vectors. A 
radically new approach involves repairing genetic mutations in situ. 
These approaches, as related to the treatment of UGT1A1 deficiency, 
are briefly described below. 

Nonviral vectors. Physical methods, such as ballistics or direct 
injection into the liver parenchyma, have resulted in a limited extent 
of gene transfer. Naked DNA can be transfected into the liver by rapid 
intravenous administration of plasmids at high volumes, causing a 
volume overload and hepatic congestion. Obviously this approach is 
not translatable to clinical application. DNA can be complexed elec- 
trostatically with polycations, which can be lactosylated or complexed 
with galactose-terminated peptides for hepatocyte-targeted delivery in 
vivo. The ligand-DNA complex is internalized by receptor-mediated 
endocytosis via hepatocyte-specific asialoglycoprotein receptors. Al- 
though the majority of the ligand is translocated to the lysosome, a 
small fraction reaches the nucleus, where it is expressed transiently. 
The transgene expression can be prolonged to several weeks by 
performing partial hepatectomy or transient pharmacological disrup- 
tion of the microtubules (Roy Chowdhury et al., 1996) or by using 
polycations, such as polyethyleneimine, that destabilize the endoso- 
mal vesicles. An important new advance in plasmid-directed gene 
transfer has been the use of a Tc1/mariner-type transposon system, 
termed Sleeping Beauty. Expression of the Sleeping Beauty transposase 
results in the transposition of stretches of DNA, flanked by 
inverted/direct repeats of the transposon sequences, into host cellular 
chromosomes, leading to long-term transgene expression (Izsvák et 
al., 2000). 

Viral vectors. Recombinant viruses capable of transferring genes 
into cells in vitro or in vivo are commonly used for gene therapy. 
These vectors can be classified into those that remain episomal and 
those that integrate into the host genome. Recombinant adenoviruses 
are prototypes of episomal vectors. Adenoviral vectors transfer genes 
into nondividing hepatocytes in vivo with great efficiency, although 
the longevity of the episomal vector is limited. Moreover, adenoviral 
proteins are strongly immunogenic, and host humoral and cellular 
immune response precludes repeated gene transfer. Specific tolerization 
of the host by administration in neonatal rats (Takahashi et al., 
1996), intrathymic injection of adenoviral proteins (Ilan et al., 1996), 
or oral administration of small doses of adenoviral proteins (Ilan et al., 
1997) have resulted in specific host tolerance to the viral proteins and 
have permitted repeated administration of the recombinant virus and 
long-term amelioration of jaundice in Gunn rats. Recently, immunomodulatory 
genes, such as CTLA4Ig, have been incorporated into 
adenoviral vectors (Thummala et al., 2002). Coexpression of such 
genes, along with the transgene, makes the virus nonimmunogenic, 
prolongs the transgene expression, and permits repeated administra- 
tion. 

Retroviral vectors integrate into the host genome. Although recom- 
binant oncoretroviruses have been used extensively for gene therapy 
(Roy Chowdhury et al., 1991; Tada et al., 1998), these vectors require 
host cell mitosis for integration, which is infrequent in the quiescent 
liver. Vectors based on immunodeficiency-type retroviruses (lentivi- 
ruses) that can integrate into nondividing cells are being developed for 
the treatment of UGT1A1 deficiency. Recombinant adenoassociated 
viruses (rAAVs) are also being tested in Gunn rats, although, so far, 
it has been possible to transduce only up to 5% of rat hepatocytes by 
intraportal infusion. Recent studies indicate that rAAVs do not inte- 
grate to a significant extent in immunocompetent animals, suggesting 
that, in contrast to previous expectations, the transgene expression 
may not be permanent (Nakai et al., 2001). Moreover, because rAAV 
vectors evoke a host antibody response, repeated administration may 
be problematic. Therefore, the search for newer vectors continues. 
Recombinant simian virus 40 is promising as an integrating vector 
because this T-antigen-deleted virus does not evoke host immune 
response. Recombinant simian virus 40 integrates into the host ge- 
nome progressively over several days. After infusion of a recombinant 
SV40 expressing human UGT1A1 into the portal vein of the Gunn rat, 
serum bilirubin was reduced by up to 60% and remained at that level 
throughout an 18-month period of observation, suggesting a perma- 
nent therapeutic effect. There was no detectable antibody response to 
the recombinant virus, and gene transfer was repeatable upon injection 
of a recombinant SV40 vector expressing a different transgene (Sauter 
et al., 2000). These viruses can be grown and concentrated to infectious 
titers of 10^11 to 10^12. Thus, the recombinant SV40 appears to 
have a great potential in liver-directed gene therapy, its major limi- 
tation being a relatively small DNA packaging space (4-4.5 kilo- 
bases) (Strayer et al., 2002). 

Gene repair therapy. Site-directed gene repair in vivo is a novel 
form of gene therapy, which relies on the cellular DNA repair mech- 
anisms to correct point mutations or short deletions (Kren et al., 
1999). The method utilizes synthetic RNA-DNA chimeras, which align 
to a target sequence in the genome with high specificity and effi- 
ciency. The nucleotide sequence in the chimera is complementary to 
the target genomic sequence, except for a single mismatch. After 
alignment, the mismatch between the DNA limb of the chimeric 
molecule and the complementary strand of the genomic DNA triggers 
the cell's mismatch repair enzymes, resulting in correction of the 
mutation. For correction of the single guanosine base deletion in the 
UGT1A1 exon 4 of Gunn rats, the RNA-DNA chimera was constructed 
to contain the wild-type sequence (Kren et al., 1999). The 
chimeric molecules were complexed with lactosylated polyethyleneimine 
or with galactosylated liposomes. After intravenous infusion of 
the carrier-chimera complex, the genetic lesions were corrected in 1 to 
10% of the alleles, resulting in the appearance of the enzymatically 
active full-length UGT1A1 in the liver of the recipient Gunn rats. 
Function of the enzyme was demonstrated by the appearance of 
bilirubin glucuronides in bile and reduction of plasma bilirubin levels 
to 35 to 40% of pretreatment levels. Long-term follow-up indicates 
permanent correction of the genetic lesion. Figure 3 shows immunohistochemical 
staining of liver sections from Gunn rats treated with 
recombinant adenoviral or SV40 vectors, or with synthetic RNA-DNA 
chimeras for gene repair. In summary, both nonviral and viral 
vectors are being rapidly improved for overcoming the challenges of 
liver-directed gene therapy for CN-1. Hepatocyte transplantation and 
gene therapy are being developed in parallel to complement each 
other. 

Fig. 3. Immunohistochemical staining of sections of Gunn rat liver with 
anti-UGT1A1 antibodies. 

For panels a-c, immunohistochemical staining was performed using a human 
UGT1A-specific monoclonal antibody. Panel a, control (untreated) Gunn rat; panel 
b, liver from a Gunn rat 1 week after the injection of a first-generation adenovirus 
expressing human UGT1A1 (10^9 pfu). Panel c, liver from a Gunn rat 1 year after the 
injection of 10 s pfu of a recombinant SV40 expressing human UGT1A1. Panel d, 
gene repair therapy in Gunn rat. Liver biopsy was performed 1 year after five 
injections of an RNA-DNA chimera designed to insert the missing guanosine 
residue in Gunn rat UGT1A1 (exon 4). Immunohistochemical staining was performed 
using a rabbit antiserum specific for rat UGT1A1. Please note that the 
staining of the positive cells is light, probably indicating that only one allele per cell 
is corrected. 

It is apparent that significant progress continues to be made in our 
understanding of the UGT gene family and their roles in physiology 
and pathophysiology. Molecular factors involved in the control of 
expression of the UGTs are being identified, providing clues to how 
these genes are regulated from tissue to tissue, during development, 
and in response to certain chemical exposures. The substrate speci- 
ficities and activities of the different UGT family members in humans 
are beginning to be understood, and information on their rodent 
counterparts is now coming to light which will be valuable for 
interpretation of toxicity studies in rodents. Identification of genetic 
differences among individuals in the sequences of UGT genes and 
their impact on risks of toxicity from exposure to drugs and environ- 
mental chemicals is a particularly active area of investigation and will 
be useful for predictive risk assessment. The research comes full circle 
with current efforts toward development of an effective gene therapy 
for treatment of Crigler-Najjar syndrome, providing a greatly needed 
alternative to liver transplantation. It should be clear that advances 
from both basic and clinical research involving animals and humans 
have provided critical insights into the role of glucuronidation and 
UGTs in health, drug therapy, and disease. 

References 

Akaba K, Kimura T, Sasaki A, Tanabe S, Ikegami T, Hashimoto M, Umeda H, Yoshida H, 
Umetsu K, Chiba H, et al. (1998) Neonatal hyperbilirubinemia and mutation of the bilirubin 
uridine diphosphate-glucuronosyltransferase gene: a common missense mutation among Japanese, 
Koreans and Chinese. Biochem Mol Biol Int 46:21-26. 

Beaulieu M, Levesque E, Tchernof A, Beatty BG, Belanger A, and Hum DW (1997) Chromosomal 
localization, structure and regulation of the UGT2B17 gene, encoding a C19 steroid 
metabolizing enzyme. DNA Cell Biol 16:1143-1154. 

Bernard P, Goudonnet H, Artur Y, Desvergne B, and Wahli W (1999) Activation of the mouse 
TATA-less and human TATA-containing UDP-glucuronosyltransferase 1A1 promoters by 
hepatocyte nuclear factor 1. Mol Pharmacol 56:526-536. 

Beutler E, Gelbart T, and Demina A (1998) Racial variability in the UDP-glucuronosyltransferase 
1 (UGT1A1) promoter: a balanced polymorphism for regulation of bilirubin metabolism? 
Proc Natl Acad Sci USA 95:8170-8174. 

Bock KW, Forster A, Gschaidmeier H, Bruck M, Munzel P, Schareck W, Fournel-Gigleux S, and 
Burchell B (1993) Paracetamol glucuronidation by recombinant rat and human phenol 
UDP-glucuronosyltransferases. Biochem Pharmacol 45:1809-1814. 

Bosma PJ, Roy Chowdhury J, and Bakker C (1995) The genetic basis of the reduced expression 
of bilirubin UDP-glucuronosyltransferase 1 in Gilbert's syndrome. New Engl J Med 333: 
1171-1175. 

Bosma PJ, Seppen J, Goldhoorn B, Bakker C, Oude Elferink RP, Roy Chowdhury J, Roy 
Chowdhury N, and Jansen PL (1994) Bilirubin UDP-glucuronosyltransferase 1 is the only 
relevant bilirubin glucuronidating isoform in man. J Biol Chem 269:17960-17964. 

Ciotti M, Marrone A, Potter C, and Owens IS (1997) Genetic polymorphism in the human 
UGT1A6 (planar phenol) UDP-glucuronosyltransferase: pharmacological implications. 
Pharmacogenetics 7:485-495. 

Coffman BL, King CD, Rios GR, and Tephly TR (1998) The glucuronidation of opioids, other 
xenobiotics, and androgens by human UGT2B7Y(268) and UGT2B7H(268). Drug Metab 
Dispos 26:73-77. 

Court MH, Duan SX, von Moltke LL, Greenblatt DJ, Patten CJ, Miners JO, and Mackenzie PI 
(2001) Interindividual variability in acetaminophen glucuronidation by human liver microsomes: 
identification of relevant acetaminophen UDP-glucuronosyltransferase isoforms. 
J Pharmacol Exp Ther 299:998-1006. 

Davies MH and Schnell RC (1991) Oltipraz-induced amelioration of acetaminophen hepatotoxicity 
in hamsters. II. Competitive shunt in metabolism via glucuronidation. Toxicol Appl 
Pharmacol 109:29-40. 

de Morais SMF, Chow SYM, and Wells PG (1992a) Biotransformation and toxicity of acetaminophen 
in congenic RHA rats with or without a hereditary deficiency in 
UDP-glucuronosyltransferase. Toxicol Appl Pharmacol 117:81-87. 

de Morais SMF, Uetrecht JP, and Wells PG (1992b) Decreased glucuronidation and increased 
bioactivation of acetaminophen in Gilbert's syndrome. Gastroenterology 102:577-586. 

de Morais SMF and Wells PG (1988) Deficiency in bilirubin UDP-glucuronyl transferase as a 
genetic determinant of acetaminophen toxicity. J Pharmacol Exp Ther 247:323-331. 

de Morais SMF and Wells PG (1989) Enhanced acetaminophen toxicity in rats with bilirubin 
UDP-glucuronyl transferase deficiency. Hepatology 10:163-167. 

Ebner T, Remmel RP, and Burchell B (1993) Human bilirubin UDP-glucuronosyltransferase 
catalyzes the glucuronidation of ethinylestradiol. Mol Pharmacol 43:649-654. 

Esteban A and Perez-Mateo M (1999) Heterogeneity of paracetamol metabolism in Gilbert's 
syndrome. Eur J Drug Metab Pharmacokinet 24:9-13. 

Fang JL, Beland FA, Doerge DR, Wiener D, Guillemette C, Marques MM, and Lazarus P (2002) 
Characterization of benzo(a)pyrene-trans-7,8-dihydrodiol glucuronidation by human tissue 
microsomes and overexpressed UDP-glucuronosyltransferase enzymes. Cancer Res 62:1978- 
1986. 

Fox IJ, Roy Chowdhury J, Kaufman SS, Goertzen TC, Chowdhury NR, Warkentin PI, Dorko K, 
Sauter BV, and Strom SC (1998) Treatment of the Crigler-Najjar syndrome type 1 with 
hepatocyte transplantation. N Engl J Med 338:1422-1426. 

Gestl SA, Green MD, Shearer DA, Frauenhoffer E, Tephly TR, and Weisz J (2002) Expression 
of UGT2B7, a UDP-glucuronosyltransferase implicated in the metabolism of 
4-hydroxyestrone and all-trans retinoic acid, in normal human breast parenchyma and in invasive 
and in situ breast cancers. Am J Pathol 160:1467-1479. 

Girard H, Journault K, and Guillemette C (2003) Haplotypic structure of the carcinogen- 
metabolizing enzyme UGT1A7: nine polymorphic alleles (*1 through *9). Abstract 7485, in 
ASPET, Experimental Biology 2003, April 11-15, 2003, San Diego, CA. 

Gong QH, Cho JW, Huang T, Potter C, Gholami N, Basu NK, Kubota S, Carvalho S, Pennington 
MW, Owens IS, and Popescu NC (2001) Thirteen UDPglucuronosyltransferase genes are 
encoded at the human UGT1 gene complex locus. Pharmacogenetics 11:357-368. 

Gregory PA, Hansen AJ, and Mackenzie PI (2000) Tissue specific differences in the regulation 
of the UDP glucuronosyl transferase 2B17 gene promoter. Pharmacogenetics 10:809-820. 

Grove AD, Kessler FK, Metz RP, and Ritter JK (1997) Identification of a rat oltipraz-inducible 
UDP-glucuronosyltransferase (UGT1A7) with activity towards benzo(a)pyrene-7,8- 
dihydrodiol. J Biol Chem 272:1621-1627. 

Grove AD, Llewellyn GC, Kessler FK, White KL, Crespi CL, and Ritter JK (2000) Differential 
protection by rat UDP-glucuronosyltransferase 1A7 against benzo[a]pyrene-3,6-quinone- 
versus benzo[a]pyrene-induced cytotoxic effects in human lymphoblastoid cells. Toxicol Appl 
Pharmacol 162:34-43. 

Gsur A, Preyer M, Haidinger G, Schatzl G, Madersbacher S, Marberger M, Vutuc C, and 
Micksche M (2002) A polymorphism in the UDP-glucuronosyltransferase 2B15 gene (D85Y) 
is not associated with prostate cancer risk. Cancer Epidemiol Biomarkers Prev 11:497-498. 

Guha C, Parashar B, Deb NJ, Garg M, Gorla GR, Singh A, Roy-Chowdhury N, Vikram B, and 
Roy-Chowdhury J (2002) Long-term normalization of serum bilirubin levels by massive 
repopulation of Gunn rat liver by normal hepatocytes, transplanted after preparative hepatic 
irradiation and partial hepatectomy. Hepatology 36:354-362. 

Guha C, Sharma A, Gupta S, Alfieri A, Gorla GR, Gagandeep S, Sokhi R, Roy-Chowdhury N, 
Tanaka KE, Vikram B, and Roy-Chowdhury J (1999) Amelioration of radiation-induced liver 
damage in partially hepatectomized rats by hepatocyte transplantation. Cancer Res 59:5871- 
5874. 

Guillemette C, De Vivo I, Hankinson SE, Haiman CA, Spiegelman D, Housman DE, and Hunter 
DJ (2001) Association of genetic polymorphisms in UGT1A1 with breast cancer and plasma 
hormone levels. Cancer Epidemiol Biomarkers Prev 10:711-714. 

Guillemette C, Levesque E, Beaulieu M, Turgeon D, Hum DW, and Belanger A (1997) 
Differential regulation of two uridine diphospho-glucuronosyl transferases, UGT2B15 and 
UGT2B17, in human prostate LNCaP cells. Endocrinology 138:2998-3005. 

Guillemette C, Millikan R, Newman B, and Housman DE (2000a) Genetic polymorphisms in 
UGT1A1 and association with breast cancer among African Americans. Cancer Res 60:950- 
956. 

Guillemette C, Ritter JK, Auyeung DJ, Kessler FK, and Housman DE (2000b) Structural 
heterogeneity at the UDP-glucuronosyltransferase 1A locus: functional consequences of three 
novel missense mutations in the human UGT1A7 gene. Pharmacogenetics 10:629-644. 

Guillemette CG (2003) Pharmacogenomics of human UDP-glucuronosyltransferase enzymes. 
Pharmacogenom J 3:136-158. 

Gupta S, Aragona E, Vemuru RP, Bhargava KK, Burk RD, and Chowdhury JR (1991) Permanent 
engraftment and function of hepatocytes delivered to the liver: implications for gene therapy 
and liver repopulation. Hepatology 14:144-149. 

Haiman CA, Hankinson SE, De Vivo I, Guillemette C, Ishibe N, Hunter DJ, and Byrne C (2003) 
Polymorphisms in steroid hormone pathway genes and mammographic density. Breast Cancer 
Res Treat 77:27-36. 

Hall D, Ybazeta G, Destro-Bisol G, Petzl-Erler ML, and Di Rienzo A (1999) Variability at the
uridine diphosphate glucuronosyltransferase 1A1 promoter in human populations and
primates. Pharmacogenetics 9:591-599.

Hansen AJ, Lee YH, Gonzalez FJ, and Mackenzie PI (1997) HNF1 alpha activates the rat UDP
glucuronosyltransferase UGT2B1 gene promoter. DNA Cell Biol 16:207-214.

Hansen AJ, Lee YH, Sterneck E, Gonzalez FJ, and Mackenzie PI (1998) C/EBPalpha is a
regulator of the UDP glucuronosyltransferase UGT2B1 gene. Mol Pharmacol 53:1027-1033.

Haque SJ, Petersen DD, Nebert DW, and Mackenzie PI (1991) Isolation, sequence and
developmental expression of rat UGT2B2: the gene encoding a constitutive UDP
glucuronosyltransferase that metabolizes etiocholanolone and androsterone. DNA Cell Biol
10:515-524.

Hu Z and Wells PG (1992) In vitro and in vivo biotransformation and covalent binding of 
benzo(a)pyrene in rats with a genetic deficiency in bilirubin UDP-glucuronosyltransferase. 
J Pharmacol Exp Ther 263:334-342. 

Hu Z and Wells PG (1993) Human interindividual variation in lymphocyte
UDP-glucuronosyltransferases as a determinant of benzo[a]pyrene covalent binding and
cytotoxicity. ISSX Proc 4:241.

Hu Z and Wells PG (1994) Modulation of benzo[a]pyrene bioactivation and cytotoxicity by
glucuronidation in lymphocytes and hepatic microsomes from rats with a hereditary deficiency
in bilirubin UDP-glucuronosyltransferase. Toxicol Appl Pharmacol 127:306-313.

Huang YH, Galijatovic A, Nguyen N, Geske D, Beaton D, Green J, Green M, Peters WH, and
Tukey RH (2002) Identification and functional characterization of
UDP-glucuronosyltransferases UGT1A8*1, UGT1A8*2 and UGT1A8*3. Pharmacogenetics 12:287-297.

Ikushiro S, Emi Y, and Iyanagi T (1995) Identification and analysis of drug-responsive
expression of UDP-glucuronosyltransferase family 1 (UGT1) isozyme in rat hepatic microsomes
using anti-peptide antibodies. Arch Biochem Biophys 324:267-272.

Ilan Y, Attavar P, Takahashi M, Davidson A, Horwitz M, Guida J, Roy Chowdhury N, and Roy
Chowdhury J (1996) Induction of central tolerance by intrathymic inoculation of adenoviral
antigens into the host thymus permits long-term gene therapy in Gunn rats. J Clin Investig
98:2640-2647.

Ilan Y, Prakash R, Davidson A, Jona V, Droguett G, Horwitz MS, Roy Chowdhury N, and Roy
Chowdhury J (1997) Oral tolerization to adenoviral antigens permits long-term gene
expression using recombinant adenoviral vectors. J Clin Investig 99:1098-1106.

Ishii Y, Hansen AJ, and Mackenzie PI (2000) Octamer transcription factor-1 enhances hepatic
nuclear factor-1alpha-mediated activation of the human UDP glucuronosyltransferase 2B7
promoter. Mol Pharmacol 57:940-947.

Iyanagi T, Emi Y, and Ikushiro S (1998) Biochemical and molecular aspects of genetic disorders
of bilirubin metabolism. Biochim Biophys Acta 1407:173-174.

Iyer L, King CD, Whitington PF, Green MD, Roy SK, Tephly TR, Coffman BL, and Ratain MJ
(1998) Genetic predisposition to the metabolism of irinotecan (CPT-11). Role of uridine
diphosphate glucuronosyltransferase isoform 1A1 in the glucuronidation of its active
metabolite (SN-38) in human liver microsomes. J Clin Investig 101:847-854.

Izsvak Z, Ivics Z, and Plasterk RH (2000) Sleeping Beauty, a wide host-range transposon vector
for genetic transformation in vertebrates. J Mol Biol 302:93-102.

Jinno H, Saeki M, Saito Y, Tanaka-Kagawa T, Hanioka N, Sai K, Kaniwa N, Ando M, Shirao
K, Minami H, et al. (2003a) Functional characterization of human
UDP-glucuronosyltransferase (UGT) 1A9 variant, D256N, found in Japanese cancer patients.
J Pharmacol Exp Ther 306:688-693.

Jinno H, Saeki M, Tanaka-Kagawa T, Hanioka N, Saito Y, Ozawa S, Ando M, Shirao K, Minami
H, Ohtsu A, et al. (2003b) Functional characterization of wild-type and variant (T202I and
M59I) human UDP-glucuronosyltransferase 1A10. Drug Metab Dispos 31:528-532.

Kessler FK, Kessler MR, Auyeung DJ, and Ritter JK (2002) Glucuronidation of acetaminophen
catalyzed by multiple rat phenol UDP-glucuronosyltransferases. Drug Metab Dispos
30:324-330.

Kessler FK and Ritter JK (1997) Induction of a rat liver benzo[a]pyrene-trans-7,8-dihydrodiol
glucuronidating activity by oltipraz and beta-naphthoflavone. Carcinogenesis 18:107-114.

Kim PM, DeBoni U, and Wells PG (1997a) Peroxidase-dependent bioactivation and oxidation of
DNA and protein in benzo[a]pyrene-initiated micronucleus formation. Free Radic Biol Med
23:579-596.

Kim PM and Wells PG (1996a) Genoprotection by UDP-glucuronosyltransferases in
peroxidase-dependent, reactive oxygen species-mediated micronucleus initiation by the
carcinogens 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK) and benzo[a]pyrene.
Cancer Res 56:1526-1532.

Kim PM and Wells PG (1996b) Phenytoin-initiated hydroxyl radical formation: characterization
by enhanced salicylate hydroxylation. Mol Pharmacol 49:172-181.






WELLS ET AL.



Kim PM and Wells PG (1998) Phenytoin embryotoxicity: protection by
UDP-glucuronosyltransferases. Toxicol Sci 42(1-S):261 (No. 1288).

Kim PM, Winn LM, Parman T, and Wells PG (1997b) UDP-glucuronosyltransferase-mediated
protection against in vitro DNA oxidation and micronucleus formation initiated by phenytoin
and its embryotoxic metabolite 5-(p-hydroxyphenyl)-5-phenylhydantoin (HPPH).
J Pharmacol Exp Ther 280:200-209.

Kobayashi T, Yokota H, Ohgiya S, Iwano H, and Yuasa A (1998) UDP-glucuronosyltransferase
UGT1A7 induced in rat small intestinal mucosa by oral administration of 2-naphthoflavone.
Eur J Biochem 258:948-955.

Kren B, Parashar B, Bandopadhyay P, Roy Chowdhury N, Roy Chowdhury J, and Steer CJ
(1999) Correction of the UDP-glucuronosyltransferase gene defect in the Gunn rat model of
Crigler-Najjar syndrome type I. Proc Natl Acad Sci USA 96:10349-10354.

Lampe JW, Bigler J, Bush AC, and Potter JD (2000). Prevalence of polymorphisms in the human 
UDP-glucuronosyltransferase 2B family: UGT2B4(D458E), UGT2B7(H268Y), and 
UGT2B15(D85Y). Cancer Epidemiol Biomarkers Prev 9:329-333. 

Lampe JW, Bigler J, Horner NK, and Potter JD (1999) UDP-glucuronosyltransferase 
(UGT1A1*28 and UGT1A6*2) polymorphisms in Caucasians and Asians: relationships to 
serum bilirubin concentrations. Pharmacogenetics 9:341-349. 

Lee YH, Sauer B, Johnson PF, and Gonzalez FJ (1997) Disruption of the C/Ebp-alpha gene in 
adult mouse liver. Mol Cell Biol 17:6014-6022. 

Levesque E, Beaulieu M, Green MD, Tephly TR, Belanger A, and Hum DW (1997) Isolation and
characterization of UGT2B15(Y85): a UDP-glucuronosyltransferase encoded by a
polymorphic gene. Pharmacogenetics 7:317-325.

Levesque E, Beaulieu M, Hum DW, and Belanger A (1999) Characterization and substrate 
specificity of UGT2B4 (E458): a UDP-glucuronosyltransferase encoded by a polymorphic 
gene. Pharmacogenetics 9:207-216. 

Levesque EE, Turgeon D, Carrier JS, Montminy V, Beaulieu M, and Belanger A (2001) Isolation 
and characterization of the UGT2B28 cDNA encoding a novel human steroid conjugating 
UDP-glucuronosyltransferase. Biochemistry 40:3869-3881. 

Lichtsteiner S, Wuarin J, and Schibler U (1987) The interplay of DNA-binding proteins on the 
promoter of the mouse albumin gene. Cell 51:963-973. 

Mackenzie PI, Owens IS, Burchell B, Bock KW, Bairoch A, Belanger A, Fournel-Gigleux S, 
Green M, Hum DW, Iyanagi T, et al. (1997) The UDP glycosyltransferase gene superfamily: 
recommended nomenclature update based on evolutionary divergence. Pharmacogenetics 
7:255-269. 

Mackenzie PI and Rodbourn L (1990) Organization of the rat UDP-glucuronosyltransferase,
UDPGTr-2, gene and characterization of its promoter. J Biol Chem 265:11328-11332.

MacLeod SL, Nowell S, Plaxco J, and Lang NP (2000) An allele-specific polymerase chain
reaction method for the determination of the D85Y polymorphism in the human
UDP-glucuronosyltransferase 2B15 gene in a case-control study of prostate cancer.
Ann Surg Oncol 7:777-782.

Maruo Y, Nishizawa K, Sato H, Doida Y, and Shimada M (1999) Association of neonatal
hyperbilirubinemia with bilirubin UDP-glucuronosyltransferase polymorphism. Pediatrics
103:1224-1227.

Mojarrabi B and Mackenzie PI (1998) Characterization of two UDP glucuronosyltransferases
that are predominantly expressed in human colon. Biochem Biophys Res Commun
247:704-709.

Monaghan G, Burchell B, and Boxer M (1997) Structure of the human Ugt2b4 gene encoding a 
bile acid UDP-glucuronosyltransferase. Mamm Genome 8:692-694. 

Nakai H, Yant SR, Storm TA, Fuess S, Meuse L, and Kay MA (2001) Extrachromosomal 
recombinant adeno-associated virus vector genomes are primarily responsible for stable liver 
transduction in vivo. J Virol 75:6969-6976. 

Owens IS and Ritter JK (1992) The novel bilirubin/phenol UDP-glucuronosyltransferase UGT1 
gene locus: implications for multiple nonhemolytic familial hyperbilirubinemia phenotypes. 
Pharmacogenetics 2:93-108. 

Radominska-Pandya A, Czernik PJ, Little JM, Battaglia E, and Mackenzie PI (1999) Structural 
and functional studies of UDP-glucuronosyltransferases. Drug Metab Rev 31:817-899. 

Riedy M, Wang JY, Miller AP, Buckler A, Hall J, and Guida M (2000) Genomic organization
of the UGT2b gene cluster on human chromosome 4q13. Pharmacogenetics 10:251-260.

Ritter JK, Chen F, Sheen YY, Tran HM, Kimura S, Yeatman MT, and Owens IS (1992) A novel 
complex locus UGT1 encodes human bilirubin, phenol and other UDP-glucuronosyltrans- 
ferase isozymes with identical carboxyl termini. J Biol Chem 267:3257-3261. 

Ritter JK, Kessler FK, Thompson MT, Grove AD, Auyeung DJ, and Fisher RA (1999)
Expression and inducibility of the human bilirubin UDP-glucuronosyltransferase UGT1A1 in
liver and cultured primary hepatocytes: evidence for both genetic and environmental
influences. Hepatology 30:476-484.

Roy Chowdhury J, Grossman M, Gupta S, Roy Chowdhury N, Baker JR Jr, and Wilson JM 
(1991) Long-term improvement of hypercholesterolemia after ex vivo gene therapy in LDL- 
receptor deficient rabbits. Science (Wash DC) 254:1802-1805. 

Roy Chowdhury N, Hays RM, Bommineni VR, Franki N, Roy Chowdhury J, Wu CH, and Wu
GY (1996) Microtubular distribution prolongs the expression of human
bilirubin-uridinediphosphoglucuronate-glucuronosyltransferase-1 gene transferred into Gunn
rat livers. J Biol Chem 271:2341-2346.

Roy Chowdhury J, Roy Chowdhury N, Strom SC, Kaufman SS, Horslen S, and Fox IJ (1998)
Hepatocyte transplantation in humans: gene therapy and more. Pediatrics 102:647-648.

Roy Chowdhury J, Wolkoff AW, Roy Chowdhury N, and Arias IM (2001) Hereditary jaundice
and disorders of bilirubin metabolism, in The Metabolic and Molecular Basis of Inherited
Disease, 8th ed (Scriver CR, Beaudet AL, Sly W, Valle D, Childs B, Kinzler K, and Vogelstein
B, eds) pp 3063-3101, McGraw-Hill, New York.

Sauter BV, Parashar B, Roy Chowdhury N, Kadakol A, Ilan Y, Singh H, Milano J, Strayer DS,
and Roy Chowdhury J (2000) Gene transfer to the liver using a replication-deficient
recombinant SV40 vector results in long-term amelioration of jaundice in Gunn rats.
Gastroenterology 119:1348-1357.

Senafi SB, Clarke DJ, and Burchell B (1994) Investigation of the substrate specificity of a cloned 
expressed human bilirubin UDP-glucuronosyltransferase: UDP-sugar specificity and involve- 
ment in steroid and xenobiotic glucuronidation. Biochem J 303:233-240. 

Strassburg CP, Nguyen N, Manns MP, and Tukey RH (1998) Polymorphic expression of the
UDP-glucuronosyltransferase UGT1A gene locus in human gastric epithelium.
Mol Pharmacol 54:647-654.

Strassburg CP, Vogel A, Kneip S, Tukey RH, and Manns MP (2002) Polymorphisms of the
human UDP-glucuronosyltransferase (UGT) 1A7 gene in colorectal cancer. Gut 50:851-856.

Strayer DS, Branco F, Zern MA, Yam P, Calarota SA, Nichols CN, Zaia JA, Rossi J, Li H,
Parashar B, et al. (2002) Durability of transgene expression and vector integration:
recombinant SV40-derived gene therapy vectors. Mol Ther 6:227-237.

Tada K, Roy Chowdhury N, Prasad VR, Kim B-H, Kalapudi M, Fox IJ, Duijvandijk P, Bosma
PJ, and Roy Chowdhury J (1998) Long-term amelioration of bilirubin glucuronidation defect
in Gunn rats by transplanting genetically modified immortalized autologous hepatocytes. Cell
Transplant 7:607-616.

Takahashi M, Ilan Y, Roy Chowdhury N, Guida J, Horwitz MS, and Roy Chowdhury J (1996)
Long-term correction of bilirubin UDP-glucuronosyltransferase deficiency in Gunn rats by
administration of a recombinant adenovirus during the neonatal period. J Biol Chem
271:26536-26542.

Thummala NR, Ghosh SS, Lee SW, Reddy B, Davidson A, Horwitz MS, Roy Chowdhury J, and
Roy Chowdhury N (2002) A non-immunogenic adenoviral vector, coexpressing CTLA4Ig and
bilirubin-uridinediphosphoglucuronateglucuronosyltransferase permits long-term, repeatable
transgene expression in the Gunn rat model of Crigler-Najjar syndrome. Gene Ther
9:981-990.

Tukey RH and Strassburg CP (2000) Human UDP-glucuronosyltransferases: metabolism,
expression and disease. Annu Rev Pharmacol Toxicol 40:581-616.

Turgeon D, Carrier JS, Levesque E, Beatty BG, Belanger A, and Hum DW (2000) Isolation and
characterization of the human UGT2B15 gene, localized within a cluster of UGT2B genes and
pseudogenes on chromosome 4. J Mol Biol 295:489-504.

Ullrich D, Sieg A, Blume R, Bock KW, Schroter W, and Bircher J (1987) Normal pathways for
glucuronidation, sulphation and oxidation of paracetamol in Gilbert's syndrome. Eur J Clin
Investig 17:237-240.

Vienneau DS, DeBoni U, and Wells PG (1995) Potential genoprotective role for
UDP-glucuronosyltransferases (UGTs) in chemical carcinogenesis: initiation of micronuclei by
benzo[a]pyrene and benzo[e]pyrene in UGT-deficient cultured rat skin fibroblasts. Cancer Res
55:1045-1051.

Villeneuve L, Girard H, Fortier LC, and Guillemette C (2003) Novel functional polymorphisms
in the UGT1A7 and UGT1A9 glucuronidating enzymes in Caucasian and African-American
subjects and their impact on the metabolism of 7-ethyl-10-hydroxycamptothecin and
flavopiridol anticancer drugs. J Pharmacol Exp Ther 307:117-128.

Vogel A, Kneip S, Barut A, Ehmer U, Tukey RH, Manns MP, and Strassburg CP (2001) Genetic
link of hepatocellular carcinoma with polymorphisms of the UDP-glucuronosyltransferase
UGT1A7 gene. Gastroenterology 121:1136-1144.

Wells PG, Kim PM, Nicol CJ, Parman T, and Winn LM (1997) Chapter 17: Reactive
intermediates, in Handbook of Experimental Pharmacology, Part I: Drug Toxicity in Embryonic
Development (Kavlock RJ and Daston GP eds.), vol 124, pp 453-518, Springer-Verlag,
Heidelberg.

Wells PG, Obilo FC, and de Morais SMF (1989) Benzo(a)pyrene embryopathy in rats genetically
deficient in bilirubin UDP-glucuronyl transferase. FASEB J 3:A1025.

Wells PG and Winn LM (1996) Biochemical toxicology of chemical teratogenesis. Crit Rev 
Biochem Mol Biol 31:1-40. 

Woolley AT, Guillemette C, Cheung CL, Housman DE, and Lieber CM (2000) Direct
haplotyping of kilobase-size DNA using carbon nanotube probes. Nat Biotechnol 18:760-763.

Zheng Z, Park JY, Guillemette C, Schantz SP, and Lazarus P (2001) Tobacco
carcinogen-detoxifying enzyme UGT1A7 and its association with orolaryngeal cancer risk.
J Natl Cancer Inst 93:1411-1418.



NCBI Sequence Viewer



http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=protein&db=prot 

Exhibit A 

with Response dated 4/26/04 








In USSN: 09/980,729 



LOCUS       NP_001066                528 aa            linear   PRI 23-DEC-2003
DEFINITION  UDP glycosyltransferase 2 family, polypeptide B10 [Homo sapiens].
ACCESSION   NP_001066
VERSION     NP_001066.1  GI:4507817
DBSOURCE    REFSEQ: accession NM_001075.2
KEYWORDS
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1
  AUTHORS   Jin,C.J., Miners,J.O., Lillywhite,K.J. and Mackenzie,P.I.
  TITLE     cDNA cloning and expression of two new members of the human liver
            UDP-glucuronosyltransferase 2B subfamily
  JOURNAL   Biochem. Biophys. Res. Commun. 194 (1), 496-503 (1993)
   PUBMED   8333863
COMMENT     PROVISIONAL REFSEQ: This record has not yet been subject to final
            NCBI review. The reference sequence was derived from X63359.1.
FEATURES             Location/Qualifiers
     source          1..528
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="4"
                     /map="4q13.3"
     Protein         1..528
                     /product="UDP glycosyltransferase 2 family, polypeptide
                     B10"
                     /EC_number="2.4.1.17"
     Region          23..526
                     /region_name="UDP-glucoronosyl and UDP-glucosyl
                     transferase"
                     /note="UDPGT"
                     /db_xref="CDD:22944"
     variation       283
                     /replace="P"
                     /replace="A"
                     /db_xref="dbSNP:1976666"
     variation       382
                     /replace="A"
                     /replace="T"
                     /db_xref="dbSNP:4095564"
     CDS             1..528
                     /gene="UGT2B10"
                     /coded_by="NM_001075.2:11..1597"
                     /note="go_component: microsome [goid 0005792] [evidence
                     IEA];
                     go_component: integral to membrane [goid 0016021]
                     [evidence IEA];
                     go_function: UDP-glucuronosyltransferase [goid 0003981]
                     [evidence E] [pmid 8333863];
                     go_function: glucuronosyltransferase activity [goid
                     0015020] [evidence IEA];
                     go_process: lipid metabolism [goid 0006629] [evidence TAS]
                     [pmid 8333863]"
                     /db_xref="GeneID:7365"
                     /db_xref="LocusID:7365"
                     /db_xref="MIM:600070"
ORIGIN
        1 malkwttvll iqlsfyfssg scgkvlvwaa eyslwmnmkt ilkelvqrgh evtvlassas
       61 ilfdpndsst lklevyptsl tktefeniim qlvkrlseiq kdtfwlpfsq eqeilwaind
      121 iirnfckdvv snkklmkklq esrfdivfad aylpcgella elfnipfvys hsfspgysfe
      181 rhsggfifpp syvpvvmskl sdqmtfmerv knmlyvlyfd fwfqifnmkk wdqfysevlg
      241 rpttlsetmr kadiwlmrns wnfkfphpfl pnvdfvgglh ckpakplpke meefvqssge
      301 ngvvvfslgs mvsnmteera nviatalaki pqkvlwrfdg nkpdalglnt rlykwipqnd
      361 llghpktraf ithggangiy eaiyhgipmv giplffdqpd niahmkakga avrvdfntms
      421 stdllnalkt vindpsyken imklsriqhd qpvkpldrav fwiefvmrhk gakhlrvaah
      481 nltwfqyhsl dvigfllacv atvlfiitkc clfcfwkfar kgkkgkrd
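
The ORIGIN block above lists residues in groups of ten, with a running position at the start of each line. As a minimal sketch (the function name and the short sample fragment are illustrative assumptions, not part of the exhibit), such a block can be reduced to a plain sequence string so that its length can be compared with the length stated on the LOCUS line:

```python
import re


def origin_to_sequence(origin_text: str) -> str:
    """Drop position numbers and whitespace from a GenPept ORIGIN block."""
    # Keep letters only; digits, spaces and line breaks are layout, not sequence.
    return re.sub(r"[^A-Za-z]", "", origin_text).upper()


# Hypothetical two-line fragment in ORIGIN layout (first 30 residues of the record).
sample = """
        1 malkwttvll iqlsfyfssg
       21 scgkvlvwaa
"""

sequence = origin_to_sequence(sample)
print(len(sequence), sequence)
```

Applied to a complete, cleanly transcribed ORIGIN block, the returned length should equal the residue count on the LOCUS line (528 aa for this record); a mismatch usually points to a transcription error in the listing.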
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□ 2912330CD1 

□ g4507817 



CLUSTAL W (1.7) Multiple Sequence Alignments

Sequence format is Pearson
Sequence 1: 2912330CD1   529 aa
Sequence 2: g4507817     528 aa
Start of Pairwise alignments
Aligning...
Sequences (1:2) Aligned. Score: 80
Start of Multiple Alignment
There are 1 groups
Aligning...
Group 1: Sequences: 2   Score: 6470
Alignment Score 2761
CLUSTAL-Alignment file created [baaubaiwy.aln]
CLUSTAL W (1.7) multiple sequence alignment



29123 3 0CD1 MSMKWTSALLLIQLSCYFSSGSCGKVLVWPTEFSHWMNIKTILDELVQRGHEVTVLASSA 

g4507817 MALKWT-TVLLIQLSFYFSSGSCGKVLWAAEYSLVMNMKTILKELVQRGHE\m^LASSA 
*..*** ..***★** ************* .*.* ***.**** **************** 

29123 3 0CD1 SISFDPNSPSTLKFEVYPVSLTKTEFEDIIKQLVKRWAELPKDTFWSYFSQVQEIMWTFN 

g4507 817 SILFDPNDSSTLKLEVYPTSLTKTEFENIIMQLVKRLSEIQKDTFWLPFSQEQEILWAIN 
** **** ****.**** ********.** ***** .*. ***** *** ***.*..* 

29123 3 0CD1 DILRKFCKDIVSNKKLMKKLQESRFDWLADAVFPFGELLAELLKIPFVYSLRFSPGYAI 

g4 5 0 7 8 1 7 DI IRNFCKDWSNKKLMKKLQESRFDI VFADAYLPCGELLAELFNI PFVYSHSFS PGYSF 

**.*.****.************★***.*.*** .* *******..****** *****.. 

29123 3 0CD1 EKHSGGLLFPPSWPVVMSELSDQMTFIERVKNMIYVLYFEFWFQIFDMKKWDQFYSEVL 

g4507 817 ERHSGGFIFPPSWPVVMSKLSDQMTFMERVKNMLWLYFDFWFQIFNMKKWDQFYSEVL 
* .****. .*********** .******* .****** .***** .****** . ************ 

29123 3 0CD1 GRPTTLSETMAKADIWLIRNYWDFQFPHPLLPNVEFVGGLHCKPAKPLPKEMEEFVQSSG 

g4507 817 GRPTTLSETMRKADIWLMRNSWNFKFPHPFLPNVDFVGGLHCKPAKPLPKEMEEFVQSSG 
********** ******.** *.*.****.****.************************* 



2912330CD1 ENGVWFSLGSMVSNTSEERANVIASALAKI PQKVLWRFDGNKPDTLGLNTRLYKWI PQN 

g4 5 0 7 8 1 7 ENGWVFSLGSMVSNMTEERANVIATALAKI PQKVLWRFDGNKPDALGLNTRLYKWI PQN 
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*************** .********.*******************.************** 

291233 0CD1 DLLGH PKTKAF ITHGGMNGI YEAI YHGVPMVGVPI FGDQLDNI AHMKAKGAAVEINFKTM 

g4 5 0 7 8 1 7 DLLGH PKTRAF ITHGGANGI YEAI YHGI PMVGI PLFFDQPDNI AHMKAKGAAVRVDFNTM 

********.******* **********.****.*.* ** ************* ..*.** 

291233 0CD1 TSEDLLRALRTVITDSSYKENAMRLSRIHHDQPVKPLDRAVFWIEFVMRHKGAKHLRSAA 

g4 5 0 7 8 1 7 S STDLLNALKTVINDPS YKENIMKLSRIQHDQPVKPLDRAVFWIEFVMRHKGAKHLRVAA 

.* *** **.★** ★ ***** *.****.**************************** ** 



291233 OCD1 HDLTWFQHYS IDVIGFLLTCVATAI FLFTKCFLFSCQKFNKTRKIEKRE 

g4 5 0 7 8 1 7 HNLTWFQYHSLDVIGFLLACVATVLFI ITKCCLFCFWKFARKGKKGKRD 

*.*****..*.*******.**** .★,.*** ** ** . * **. 
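
The pairwise "Score: 80" in the CLUSTAL W log above is, as far as the program's documentation describes it, a percent identity: identical positions divided by the positions compared, with gapped columns left out. The following is a minimal sketch of that calculation under that assumption; the function name is hypothetical, and the two short rows are just the first ten columns of the alignment above, so the result is not the full-length score.

```python
def percent_identity(row_a: str, row_b: str, gap: str = "-") -> float:
    """Percent identity over the ungapped columns of two aligned rows."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(row_a.upper(), row_b.upper()):
        if a == gap or b == gap:
            continue  # gapped columns are not counted as compared positions
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# First ten alignment columns of 2912330CD1 and g4507817.
query_row = "MSMKWTSALL"
subject_row = "MALKWT-TVL"
print(round(percent_identity(query_row, subject_row)))
```

Run over the complete aligned rows rather than this ten-column excerpt, the same function gives a figure that can be checked against the score reported in the log.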
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r 1: P36537 . UDP-glucuronosylt...[gi:549155] 



LOCUS       P36537                   528 aa            linear   PRI 15-JUN-2002
DEFINITION  UDP-glucuronosyltransferase 2B10 precursor, microsomal (UDPGT).
ACCESSION   P36537
VERSION     P36537  GI:549155
DBSOURCE    swissprot: locus UDBA_HUMAN, accession P36537;
            class: standard.
            created: Jun 1, 1994.
            sequence updated: Jun 1, 1994.
            annotation updated: Jun 15, 2002.
            xrefs: gi: 516149, gi: 516150, gi: 484384
            xrefs (non-sequence databases): MIM 600070, InterPro IPR002213,
            Pfam PF00201, PROSITE PS00375
KEYWORDS    Transferase; Glycosyltransferase; Glycoprotein; Transmembrane;
            Signal; Multigene family; Microsome.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (residues 1 to 528)
  AUTHORS   Jin,C.J., Miners,J.O., Lillywhite,K.J. and Mackenzie,P.I.
  TITLE     cDNA cloning and expression of two new members of the human liver
            UDP-glucuronosyltransferase 2B subfamily
  JOURNAL   Biochem. Biophys. Res. Commun. 194 (1), 496-503 (1993)
  MEDLINE   93326164
   PUBMED   8333863
  REMARK    SEQUENCE FROM N.A.
            TISSUE=Liver
COMMENT     This SWISS-PROT entry is copyright. It is produced through a
            collaboration between the Swiss Institute of Bioinformatics and
            the EMBL outstation - the European Bioinformatics Institute.
            The original entry is available from http://www.expasy.ch/sprot
            and http://www.ebi.ac.uk/sprot
            [FUNCTION] UDPGT IS OF MAJOR IMPORTANCE IN THE CONJUGATION AND
            SUBSEQUENT ELIMINATION OF POTENTIALLY TOXIC XENOBIOTICS AND
            ENDOGENOUS COMPOUNDS.
            [CATALYTIC ACTIVITY] UDP-glucuronate + acceptor = UDP + acceptor
            beta-D-glucuronoside.
            [SUBCELLULAR LOCATION] Microsomal.
            [SIMILARITY] BELONGS TO THE UDP-GLYCOSYLTRANSFERASE FAMILY.
FEATURES             Location/Qualifiers
     source          1..528
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
     gene            1..528
                     /gene="UGT2B10"
     Protein         1..528
                     /gene="UGT2B10"
                     /product="UDP-glucuronosyltransferase 2B10 precursor,
                     microsomal"
                     /EC_number="2.4.1.17"
     Region          1..23
                     /gene="UGT2B10"
                     /region_name="Signal"
                     /note="BY SIMILARITY."
     Region          24..528
                     /gene="UGT2B10"
                     /region_name="Mature chain"
                     /note="UDP-GLUCURONOSYLTRANSFERASE 2B10."
     Site            66
                     /gene="UGT2B10"
                     /site_type="glycosylation"
                     /note="N-LINKED (GLCNAC...) (POTENTIAL)."
     Site            314
                     /gene="UGT2B10"
                     /site_type="glycosylation"
                     /note="N-LINKED (GLCNAC...) (POTENTIAL)."
     Site            481
                     /gene="UGT2B10"
                     /site_type="glycosylation"
                     /note="N-LINKED (GLCNAC...) (POTENTIAL)."
     Region          492..512
                     /gene="UGT2B10"
                     /region_name="Transmembrane region"
                     /note="POTENTIAL."
ORIGIN
        1 malkwttvll iqlsfyfssg scgkvlvwaa eyslwmnmkt ilkelvqrgh evtvlassas
       61 ilfdpndsst lklevyptsl tktefeniim qlvkrlseiq kdtfwlpfsq eqeilwaind
      121 iirnfckdvv snkklmkklq esrfdivfad aylpcgella elfnipfvys hsfspgysfe
      181 rhsggfifpp syvpvvmskl sdqmtfmerv knmlyvlyfd fwfqifnmkk wdqfysevlg
      241 rpttlsetmr kadiwlmrns wnfkfphpfl pnvdfvgglh ckpakplpke meefvqssge
      301 ngvvvfslgs mvsnmteera nviatalaki pqkvlwrfdg nkpdalglnt rlykwipqnd
      361 llghpktraf ithggangiy eaiyhgipmv giplffdqpd niahmkakga avrvdfntms
      421 stdllnalkt vindpsyken imklsriqhd qpvkpldrav fwiefvmrhk gakhlrvaah
      481 nltwfqyhsl dvigfllacv atvlfiitkc clfcfwkfar kgkkgkrd
//
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□ 2912330CD1 

□ g549155 



CLUSTAL W (1.7) Multiple Sequence Alignments 



Sequence format is Pearson 
Sequence 1: 2912330CD1 529 aa 

Sequence 2: g549155 528 aa 

Start of Pairwise alignments 
Aligning . . . 

Sequences (1:2) Aligned. Score: 80 
Start of Multiple Alignment 
There are 1 groups 
Aligning . . . 

Group 1: Sequences: 2 Score: 6470 

Alignment Score 2761 

CLUSTAL- Alignment file created [baaOSayOy . aln] 
CLUSTAL W (1.7) multiple sequence alignment 



291233 0CD1 MSMKWTSALLLIQLSCYFSSGSCGKVLVWPTEFSHWMNIKTILDELVQRGHEVTVLASSA 

g549155 MALKWT-TVLLIQLSFYFSSGSCGKVLWAAEYSLWMNMKTILKELVQRGHEVTVLASSA 
*..*** ..★***** ************* .*.* ***.**** **************** 

291233 0CD1 SI SFDPNSPSTLKFEVYPVSLTKTEFEDIIKQLVKRWAELPKDTFWSYFSQVQEIMWTFN 

g549155 SILFDPNDSSTLKLEVYPTSLTKTEFENIIMQLVKRLSEIQKDTFWLPFSQEQEILWAIN 
** ★*** ****.**** ********.** ***** .*. ***** *** ***.*..* 

2912330CD1 DILRKFCKDIVSNKKLMKKLQESRFDWLADAVFPFGELLAELLKIPFVYSLRFSPGYAI 

g549155 DIIRNFCKDWSNKKLMKKLQESRFDIVFADAYLPCGELLAELFNIPFVYSHSFSPGYSF 
**.*.****.****************.*„*** .* *******..****** ★****.. 

291233 0CD1 EKHSGGLLFPPS WPVVMSELSDQMTFIERVKNMIYVLYFEFWFQIFDMKKWDQFYSEVL 

g549155 ERHSGGF I F P PS YVP WMS KLS DQMTFMERVKNMLYVLYFDFWFQ I FNMKKWDQF YS EVL 

* .***★ . .*********** . ******* .****** . ***** .****** .************ 

291233 0CD1 GRPTTLSETMAKADIWLIRNYWDFQFPHPLLPNVEFVGGLHCKPAKPLPKEMEEFVQSSG 

g549155 GRPTTLSETMRKADIWLMRNSWNFKFPHPFLPNVDFVGGLHCKPAKPLPKEMEEFVQSSG 
********** *****★.*★ *.*.****.****.************************* 



2912330CD1 ENGVVWSLGSMVSNTSEERANVIASALAKIPQKVLWRFDGNKPDTLGLNTRLYKWIPQN 
g549155 ENGVVWSLGSMVSNMTEERAWIATALAKIPQKVLWRFDGNKPDALGLOT 
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*************** .******** .******************* .************** 

2912330CD1 DLLGHPKTKAFITHGGMNGIYEAIYHGVPMVGVPIFGDQLDNIAHMKAKGAAVEINFKTM 

g549155 DLLGHPKTRAFITHGGANGIYEAIYHGIPMVGIPLFFDQPDNIAHMKAKGAAVRVDFNTM 
********.******* **********.****.*.* ** ************* ..*.** 

291233 0CD1 TSEDLLRALRTVITDSSYKENAMRLSRIHHDQPVKPLDRAVFWIEFVMRHKGAKHLRSAA 

g549155 SSTDLLNALKTVINDPSYKENIMKLSRIQHDQPVKPLDRAVFWIEFVMRHKGAKHLRVAA 
. * *****.*** ****** *.****.**************************** * * 

291233 0CD1 HDLTWFQHYS IDVIGFLLTCVATAIFLFTKCFLFSCQKFNKTRKI EKRE 

g549155 HNLTWFQYHSLDVIGFLLACVATVLFIITKCCLFCFWKFARKGKKGKRD 
*.*****..*.*******.**** .*..*** * * * * . * ★*. 
NCBI Sequence Viewer



http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=protein&db=prot..

1: NP_001064. UDP glycosyltrans...[gi:4507823]



LOCUS       NP_001064                529 aa            linear   PRI 23-DEC-2003
DEFINITION  UDP glycosyltransferase 2 family, polypeptide B11 [Homo sapiens].
ACCESSION   NP_001064
VERSION     NP_001064.1  GI:4507823
DBSOURCE    REFSEQ: accession NM_001073.1
KEYWORDS
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (residues 1 to 529)
  AUTHORS   Beaulieu,M., Levesque,E., Hum,D.W. and Belanger,A.
  TITLE     Isolation and characterization of a human orphan
            UDP-glucuronosyltransferase, UGT2B11
  JOURNAL   Biochem. Biophys. Res. Commun. 248 (1), 44-50 (1998)
   PUBMED   9675083
REFERENCE   2  (residues 1 to 529)
  AUTHORS   Jin,C.J., Miners,J.O., Lillywhite,K.J. and Mackenzie,P.I.
  TITLE     cDNA cloning and expression of two new members of the human liver
            UDP-glucuronosyltransferase 2B subfamily
  JOURNAL   Biochem. Biophys. Res. Commun. 194 (1), 496-503 (1993)
   PUBMED   8333863
COMMENT     PROVISIONAL REFSEQ: This record has not yet been subject to final
            NCBI review. The reference sequence was derived from AF016492.1.
FEATURES             Location/Qualifiers
     source          1..529
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="4"
                     /map="4q13.3"
     Protein         1..529
                     /product="UDP glycosyltransferase 2 family, polypeptide
                     B11"
     Region          24..527
                     /region_name="UDP-glucoronosyl and UDP-glucosyl
                     transferase"
                     /note="UDPGT"
                     /db_xref="CDD:22944"
     variation       70
                     /replace="P"
                     /replace="S"
                     /db_xref="dbSNP:7697482"
     variation       101
                     /replace="Q"
                     /replace="R"
                     /db_xref="dbSNP:7688262"
     variation       156
                     /replace="R"
                     /replace="C"
                     /db_xref="dbSNP:7697037"
     variation       289
                     /replace="P"
                     /replace="L"
                     /db_xref="dbSNP:3890590"
     CDS             1..529
                     /gene="UGT2B11"
                     /coded_by="NM_001073.1:10..1599"
                     /note="go_component: microsome [goid 0005792] [evidence
                     IEA];
                     go_component: integral to membrane [goid 0016021]
                     [evidence IEA];
                     go_function: UDP-glucuronosyltransferase [goid 0003981]
                     [evidence E] [pmid 8333863];
                     go_function: glucuronosyltransferase activity [goid
                     0015020] [evidence IEA];
                     go_process: estrogen metabolism [goid 0008210] [evidence
                     TAS] [pmid 8333863];
                     go_process: xenobiotic metabolism [goid 0006805] [evidence
                     TAS] [pmid 8333863]"
                     /db_xref="GeneID:10720"
                     /db_xref="LocusID:10720"
                     /db_xref="MIM:603064"
ORIGIN
        1 mtlkwtsvll lihlscyfss gscgkvlvwa aeyshwmnmk tilkelvqrg hevtvlassa
       61 silfdpndas tlkfevypts ltktefenii mqqykrwsdi rkdsfwlyfs qeqeilwely
      121 difrnfckdv vsnkkvmkkl qesrfdivfa davfpcgell aallnirfvy slrftpgyti
      181 erhsgglifp psyipivmsk lsdqmtfmer vknmiyvlyf dfwfqmsdmk kwdqfysevl
      241 grpttlfetm gkadiwlmrn swsfqfphpf lpnvdfvggf hckpakplpk emeefvqssg
      301 engvvvfslg svisnmtaer anviatalak ipqkvlwrfd gnkpdalgln trlykwipqn
      361 dllghpktra fithggangi yeaiyhgipm vgiplffdqp dniahmkakg aavrldfntm
      421 sstdllnalk tvindplyke nimklsriqh dqpvkpldra vfwiefvmph kgakhlrvaa
      481 hdltwfqyhs ldvigfllac vatvifiitk felfcfwkfa rkgkkgkrd
//
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SeqServer 

ClustalW Results

Confidential - Property of Incyte Corporation SeqServer Version 4.6 Jan 2002



O 2912330CD1 
□ g4507823 



CLUSTAL W (1.7) Multiple Sequence Alignments 



Sequence format is Pearson 
Sequence 1: 2912330CD1 529 aa 

Sequence 2: g4507823 529 aa 

Start of Pairwise alignments 
Aligning . . . 

Sequences (1:2) Aligned. Score: 80 
Start of Multiple Alignment 
There are 1 groups 
Aligning . . . 

Group 1: Sequences: 2 Score: 6500 

Alignment Score 2777

CLUSTAL-Alignment file created [baadWaOez.aln]
CLUSTAL W (1.7) multiple sequence alignment 



2912330CD1      MSMKWTSALLLIQLSCYFSSGSCGKVLVWPTEFSHWMNIKTILDELVQRGHEVTVLASSA
g4507823        MTLKWTSVLLLIHLSCYFSSGSCGKVLVWAAEYSHWMNMKTILKELVQRGHEVTVLASSA
                *..**** ****.**************** .*.*****.**** ****************

2912330CD1      SISFDPNSPSTLKFEVYPVSLTKTEFEDIIKQLVKRWAELPKDTFWSYFSQVQEIMWTFN
g4507823        SILFDPNDASTLKFEVYPTSLTKTEFENIIMQQVKRWSDIRKDSFWLYFSQEQEILWELY
                ** ****  ********* ********.** * ****... **.** **** ***.* .

2912330CD1      DILRKFCKDIVSNKKLMKKLQESRFDVVLADAVFPFGELLAELLKIPFVYSLRFSPGYAI
g4507823        DIFRNFCKDVVSNKKVMKKLQESRFDIVFADAVFPCGELLAALLNIRFVYSLRFTPGYTI
                **.*.****.*****.**********.*.****** ***** **.* *******.***.*

2912330CD1      EKHSGGLLFPPSYVPVVMSELSDQMTFIERVKNMIYVLYFEFWFQIFDMKKWDQFYSEVL
g4507823        ERHSGGLIFPPSYIPIVMSKLSDQMTFMERVKNMIYVLYFDFWFQMSDMKKWDQFYSEVL
                *.*****.*****.*.***.*******.************.****. *************

2912330CD1      GRPTTLSETMAKADIWLIRNYWDFQFPHPLLPNVEFVGGLHCKPAKPLPKEMEEFVQSSG
g4507823        GRPTTLFETMGKADIWLMRNSWSFQFPHPFLPNVDFVGGFHCKPAKPLPKEMEEFVQSSG
                ****** *** ******.** * ******.****.****.********************

2912330CD1      ENGVVVFSLGSMVSNTSEERANVIASALAKIPQKVLWRFDGNKPDTLGLNTRLYKWIPQN
g4507823        ENGVVVFSLGSVISNMTAERANVIATALAKIPQKVLWRFDGNKPDALGLNTRLYKWIPQN
                ***********..** . *******.*******************.**************

2912330CD1      DLLGHPKTKAFITHGGMNGIYEAIYHGVPMVGVPIFGDQLDNIAHMKAKGAAVEINFKTM
g4507823        DLLGHPKTRAFITHGGANGIYEAIYHGIPMVGIPLFFDQPDNIAHMKAKGAAVRLDFNTM
                ********.******* **********.****.*.* ** ************* ..*.**

2912330CD1      TSEDLLRALRTVITDSSYKENAMRLSRIHHDQPVKPLDRAVFWIEFVMRHKGAKHLRSAA
g4507823        SSTDLLNALKTVINDPLYKENIMKLSRIQHDQPVKPLDRAVFWIEFVMPHKGAKHLRVAA
                .* *** **.*** *  **** *.****.******************* ******** **

2912330CD1      HDLTWFQHYSIDVIGFLLTCVATAIFLFTKCFLFSCQKFNKTRKIEKRE
g4507823        HDLTWFQYHSLDVIGFLLACVATVIFIITKFCLFCFWKFARKGKKGKRD
                *******..*.*******.**** **..**  **   ** .  *  **.
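
The pairwise score reported above (Score: 80) is a percent-identity figure. As a rough sketch of how such a figure can be obtained from two aligned rows, the function below counts identical residues over the columns where neither row has a gap. The function name `percent_identity` and the choice to skip gapped columns are assumptions made for illustration; ClustalW's own calculation may differ in detail.

```python
def percent_identity(row_a: str, row_b: str) -> float:
    """Percent of identical residues over ungapped columns of two aligned rows.

    Assumption: both rows come from the same alignment, have equal length,
    and use '-' as the gap character.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(row_a, row_b):
        if a == "-" or b == "-":
            continue  # gapped columns are not counted
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# First 13 aligned residues of 2912330CD1 and g4507823 from the alignment above.
fragment_a = "MSMKWTSALLLIQ"
fragment_b = "MTLKWTSVLLLIH"
print(round(percent_identity(fragment_a, fragment_b), 1))  # 9 of 13 columns match
```

Running the function over all nine alignment blocks joined end to end would give a whole-sequence figure comparable to the reported score.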
Exhibit B

with Response dated 4/26/04

In USSN: 09/980,729

BLAST2 Search Results 




Program: blastp 
Sequence ID(s):

2912330CD1 vs. gb137mamp, gb137rodp

NCBI-BLASTP 2.0.10 [Aug-26-1999]




Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), 
"Gapped BLAST and PSI-BLAST: a new generation of protein database search 
programs", Nucleic Acids Res. 25:3389-3402. 

Query= 2912330CD1 

(529 letters) 

Database: gb137mamp

27,779 sequences; 6,697,212 total letters 

Searching done 

                                                              Score     E
Sequences producing significant alignments:                   (bits)  Value

g165799 UGT2B14; UDP-glucuronosyltransferase                    779   0.0
g2444022 UGT2B13; UDP-glucuronosyltransferase; EC 2.4.1.17      758   0.0
g165797 UGT2B13; UDP-glucuronosyltransferase                    758   0.0
g165801 UGT2C1; UDP-glucuronosyltransferase                     623   e-179
g2773068 UGT1A; UDP-glucuronosyltransferase; EC 2.4.1.17        465   e-131
g2842546 UGT1A1; UDP-glucuronosyltransferase                    437   e-123
g2773066 UGT1A; UDP-glucuronosyltransferase; EC 2.4.1.17        435   e-122
g483789 UGT1-4; UDP-glucuronosyltransferase                     428   e-120
g2605508 UDP-glucuronosyltransferase                            428   e-120
g4760841 sheUGT1A6; UDP-glucuronosyltransferase                 421   e-118

>g165799 UGT2B14; UDP-glucuronosyltransferase
Length = 530

Score = 779 bits (1990), Expect = 0.0
Identities = 360/530 (67%), Positives = 439/530 (81%), Gaps = 1/530 (0%)

Query: 1   MSMKWTSALLLI-QLSCYFSSGSCGKVLVWPTEFSHWMNIKTILDELVQRGHEVTVLASS 59
           MS+K S LLL+ QLSC F +GSCGKVLVWP +FS WMN+ ILDELV+RGHEV VL +S
Sbjct: 1   MSVKHVSVLLLLLQLSCCFRTGSCGKVLVWPMDFSLWMlSn^WILDELVRRGHEVIVLRNS 60

Query: 60  ASISFDPNSPSTLKFEVYPVSLTKTEFEDIIKQLVKRWAELPKDTFWSYFSQVQEIMWTF 119
           ASI DP+ + +KFE +P++ TK + ED+ V W +++ W YFS +Q+ + +
Sbjct: 61  ASIFIDPSKQANIKFETFPIAATKDDLEDLFVHYVSTWTNARQNSQWKYFSLLQKLFSEY 120

Query: 120 NDILRKFCKDIVSNKKLMKKLQESRFDVVLADAVFPFGELLAELLKIPFVYSLRFSPGYA 179
           +D CK++V NK LM KLQESRFD++L+DA+ P GELLAELLKI PFVYS LRF+PGY
Sbjct: 121 SDSCENACKEVVFNKTLMTKLQESRFDILLSDAIGPCGELLAELLKIPFVYSLRFTPGYT 180

Query: 180 IEKHSGGLLFPPSYVPVVMSELSDQMTFIERVKNMIYVLYFEFWFQIFDMKKWDQFYSEV 239
           +EK+SGGL PPSYVP+++S+LS +MTF+ERV NM+ +LYF+FWFQ+F+ K+WDQFYSEV
Sbjct: 181 MEKYSGGLSVPPSYVPIILSDLSGKMTFMERVNl^ 240

Query: 240 LGRPTTLSETMAKADIWLIRNYWDFQFPHPLLPNVEFVGGLHCKPAKPLPKEMEEFVQSS 299
           LGRP T SE + KAD+WLIR+YWD +FP P L PN+ + F VGGLHC KPAKPLPKEMEEFVQS S
Sbjct: 241 LGRPVTFSELVGKADMWLIRSYWDLEFPRPTLPNIQFVGGLHCKPAKPLPKEMEEFVQSS 300

Query: 300 GENGVVVFSLGSMVSNTSEERANVIASALAKIPQKVLWRFDGNKPDTLGLNTRLYKWIPQ 359
           GE GVWFSLGSMVSN +EERAN+ IASA A+ + PQKV+WRFDG KP+TLG NTR+Y WIPQ
Sbjct: 301 GEEGVVVFSLGSMVSNMTEERANLIASAFAQLPQKVIWRFDGQKPETLGPNTRIYDWIPQ 360

Query: 360 NDLLGHPKTKAFITHGGMNGIYEAIYHGVPMVGVPIFGDQLDNIAHMKAKGAAVEINFKT 419
           NDLLGHPKTKAF+THGG NGIYEAI+HG+PMVG+P+FG+Q DNIAHM AKGAA+ +N+KT
Sbjct: 361 NDLLGHPKTKAFVTHGGANGIYEAIHHGIPMVGLPLFGEQPDNIAHMTAKGAAIRLNWKT 420

Query: 420 MTSEDLLRALRTVITDSSYKENAMRLSRIHHDQPVKPLDRAVFWIEFVMRHKGAKHLRSA 479
           M+SEDLL AL+TVI D SYKEN M LS IHHDQP+KPLDRAVFWIE+VMRHKGAKHLR A
Sbjct: 421 MSSEDLLNALKTVINDPSYKENVMTLSSIHHDQPMKPLDRAVFWIEYVMRHKGAKHLRVA 480

Query: 480 AHDLTWFQHYSIDVIGFLLTCVATAIFLFTKCFLFSCQKFNKTRKIEKRE 529
           AHDLTWFQ++S+DV+GFL++C A IFL K +LF QK K K +KR+
Sbjct: 481



>g2444022 UGT2B13; UDP-glucuronosyltransferase; EC 2.4.1.17
Length = 523

Score = 758 bits (1937), Expect = 0.0 

Identities = 358/522 (68%), Positives = 422/522 (80%), Gaps = 1/522 (0%) 



Query: 9 LLLIQLSCYFSSGSCGKVLVWPTEFSHWMNIKTILDELVQRGHEVTVLASSASISFDPNS 68 

LLL+QLSC FSSGSCGKVLWP EFSHWMN+KTILD LVQRGH VTVL SSASI + N 
Sbjct: 2 LLLLQLSCCFSSGSCGKVLWPMEFSHWMNMKTILDALVQRGHAVTVLRSSASILVNSND 61 

Query: 69 PSTLKFEWPVSLTKTEFEDIIKQ-LVKRWAELPKDTFWSYFSQVQEIMWTFNDILRKFC 127 

S + FE +P + TK E E L K ++ KD W YF Q+ ++D C 

Sbjct : 62 ESGITFETFPTTSTKDEMEAFFiyTYWLNKLTNDVSKDALWEYFQTWQKFFMEYSDNYENIC 121 

Query: 128 KDIVSNKKLMKKLQES RFDWL ADAVF PFGELLAELLKI PFVYS LRF SPG YAIEKHSGGL 187 

KD+V NKK+M KLQES RFDWL AD + P GELLAELL P VYS+RF+PGY EK+SGGL 
Sbjct: 122 KDLVLNKKIMAKLQESRFDWLADPIAPCGELLAELLNRPLVYSVRFTPGYTYEKYSGGL 181 

Query: 188 LFPPSWPVVMSELSDQMTFIERVKNMIWLYFEFWFQIFDMKKWDQFYSEVLGRPTTLS 247 

LFPPSYVPV+MS+LS QMTF+ERVKNM+++LYF+FWFQ+ ++K+WDQF SEVLGRP T S 
Sbjct: 182 LFPPSWPVIMSDLSGQMTFMERVKIsnyiLWMLYFDFWFQMLNVKRWDQFCSEVLGRPVTFS 241 

Query: 248 ETMAKADIWL I RNYWDFQF PHPLLPNVEFVGGLHCKPAKPL PKEMEEFVQS SGENGVWF 307 

E + KA+IWLIR+YWD +FP PLLPN FVGGLHCKPA+PLPKEMEEFVQSSGE GVWF 
Sbjct: 242 ELVGKAEIWLIRSYWDLEFPRPLLPNSYFVGGLHCKPAQPL PKEMEEFVQS SGEEGVWF 301 

Query: 308 SLGSMVSNTSEERANVI ASALAKI PQKVLWRFDGNKPDTLGLNTRLYKWI PQNDLLGHPK 367 

SLGSM+SN +EERANVIAS LA+ + PQKVLW+FDG KPD LG NT+LYKWI PQNDLLGH 
Sbjct: 302 SLGSMISNLTEERANVIASTLAQLPQKVLWKFDGKKPDNLGTNTQLYKWIPQNDLLGHTV 361 

Query: 368 TKAFITHGGMNGIYEAIYHGVPMVGVPIFGDQLDNIAHMKAKGAAVEINFKTMTSEDLLR 427 
+KAFITHGG NG++EAIYHG+PMVG+P+F DQ DN+AHM+AKGAA+ +++KTM+S D L 
Sbjct: 362 SKAFITHGGANGVFEAIYHGIPMVGLPLFADQHDNLAHMRAKGAAIRLDWKTMSSSDFLN 421

Query: 428 ALRTVITDSSYKENAMRLSRIHHDQPVKPLDRAVFWIEFVMRHKGAKHLRSAAHDLTWFQ 487 

AL+TVI D SYKE AM LSRIHHDQP+KPLD+A+FWIEFVMRHKGAKHLR AAHDLTWFQ 
Sbjct: 422 ALKTVINDPSYKEKAMTLSRIHHDQPMKPLDQAIFWIEFVMRHKGAKHLRVAAHDLTWFQ 481 

Query: 488 HYSIDVIGFLLTCVATAIFLFTKCFLFSCQKFNKTRKIEKRE 529 

++S+DVIGFLL C+ +L KC+L Q T K +KR+ 
Sbjct: 482 YHSLDVIGFLLACLTITTYLVIKCWLLVYQNILMTGKKKKRD 523



>g165797 UGT2B13; UDP-glucuronosyltransferase
Length = 531

Score = 758 bits (1936), Expect = 0.0 

Identities = 362/522 (69%), Positives = 423/522 (80%), Gaps = 1/522 (0%) 

Query: 9 LLLIQLSCYFSSGSCGKVLWPTEFSHWMNIKTILDELVQRGHEVTVLASSASISFDPNS 68 

LLL+QLSC FSSGSCGKVLVWP EFSHWMN+KTILD LVQ+GHEVTVL SSASI N+ 
Sbjct: 10 LLLLQLSCCFSSGSCGKVLWPMEFSHWMNMKTILDALVQQGHEVTVLRSSAS IVIGSNN 69 

Query : 69 PSTLKFEVYPVSLTKTEFEDIIKQ-LVKRWAELPKDTFWSYFSQVQEIMWTFNDILRKFC 127 

S +KFE + S K E E+ K + +++W FS + ++ ++DI C 

Sbjct : 7 0 ESGIKFETFHTSYRKDEIENFFMDWFYKMIYNVSIESYWETFSLTKMVILKYSDICEDIC 129 

Query: 128 KDIVSNKKLMKKLQESRFDWLADAVFPFGELLAELLKIPFVYSLRFSPGYAIEKHSGGL 187 

K+++ NKKLM KLQESRFDWLAD V P GELLAELLKIP VYSLR GY ++KH GGL 
Sbjct: 130 KEVILNKKLMTKLQESRFDWLADPVSPGGELLAELLKIPLVYSLRGFVGYMLQKHGGGL 189 

Query: 188 LFPPSYVPVVMSELSDQMTFIERV1CNMIWLYFEFWFQIFDMKKWDQFYSEVLG 247 

L PPSYVPV+MS L QMTF + ERV+N+ + VLYF+FWF F+ K+WDQFYSEVLGRP T 
Sbjct: 190 LLPPSWPVM4SGLGSQMTFMERVQNLLCVLYFDFWFPKFNEKRWDQFYSEVLGRPVTFL 249 

Query: 248 ETMAJCADIWLIRJSTTODFQFPHPLLPWEFVGGLHCKPAKPLPKEMEEFVQSSGENGVVV^ 307 

E M KAD+WLIR+YWD +FP PLLPN +F+GGLHCKPAKPLP+EME+FVQSSGE GWVF 
Sbjct: 250 ELMGKADMWLIRSYWDLEFPRPLLPNFDFIGGLHCKPAKPLPQEMEDFVQSSGEEGVWF 309 

Query: 308 SLGSMVSNTSEERANVIASALAKI PQKVLWRFDGNKPDTLGLNTRLYKWI PQNDLLGHPK 367 

SLGSM+SN +EERANVIASALA++PQKVLWRF+G KPD LG NTRLYKWI PQNDLLGHPK 
Sbjct: 310 S LGSMISNLTEERAIWI AS ALAQLPQKVIjWRFEGKKPDMLGSNTRLYKWI PQNDLLGHPK 369 

Query: 3 68 TKAFITHGGMNGIYEAIYHGVPMVGVPIFGDQLDNIAHMKAKGAAVEINFKTMTSEDLLR 427 

TKAFITHGG NG++EAIYHG+PMVG+P+FGDQLDNI +MKAKGAAV++N KTM+S DLL 
Sbjct: 370 TKAFITHGGANGVFEAIYHGIPMVGLPLFGDQLDNIWMKAKGAAVKLNLKTMSSADLLN 429 

Query: 428 ALRTVITDSSYKENAMRLSRIHHDQPVKPLDRAVFWIEFVMRHKGAKHLRSAAHDLTWFQ 487 

AL+TVI D SYKENAM LSRIHHDQP+KPLDRAVFWIE+VMRHKGAKHLR AAHDLTW+Q 
Sbjct: 430 ALKTVINDPSYKENAMTLSRIHHDQPMKPLDRAVFWIEYVMRHKGAKHLRVAAHDLTWYQ 489 

Query: 488 HYSIDVIGFLLTCVATAIFLFTKCFLFSCQKFNKTRKIEKRE 529 

++S+DVIGFLL CVA +L KC L + K +KR+ 

Sbjct: 490 YHSLDVIGFLLACVAITTYLIVKCCLLVYRYVLGAGKKKKRD 531



>g165801 UGT2C1; UDP-glucuronosyltransferase
Length = 502 

Score = 623 bits (1590), Expect = e-179 

Identities = 301/499 (60%), Positives = 383/499 (76%), Gaps = 2/499 (0%) 

Query: 32 EFSHWMNIKTILDELVQRGHEVTVLASSASISFDPNSPSTLKFEVYPVSLTK-TEFEDII 90 

+FSHW+N+K IL+EL RGHE+TVL S S+ D ++ EV + +TK T E+ + 

Sbjct : 5 DFSHWINLKVILEELQLRGHEITVLVPSPSLLLD-HTKIPFNVEVLQLQVTKETLMEELN 63 

Query: 91 KQLVKRWAELPKDTFWSYFSQVQEIMWTFNDILRKFCKDIVSNKKLMKKLQESRFDWLA 150 

L ELP ++W ++ E+ F+ LR+ C ++NK+L+ +L+ ++FD+ LA 

Sbjct: 64 TVLYMSSFELPTLSWWKVLGKMVEMGKQFSKNLRRVCDSAITNKELLDRLKAAKFDICLA 123 
Query: 151 DAVFPFGELLAELLKIPFVYSLRFSPGYAIEKHSGGLLFPPSYVPVVMSELSDQMTFIER 210

D + GEL+AELL IPFVYS RFS G IE+ GL P SYVP S L+D M+F++R 
Sbjct: 124 DPLAFCGELVAELLNIPFVYSFRFSIGNIIERSCAGLPTPSSYVPGSTSGLTDNMSFVQR 183 

Query: 211 VKNMIWLYFEFWFQIFDMKKWDQFYSEVLGRPTTLSETMAKADIWLIRNYWDFQFPHPL 270 

+KN + L + F F + +WD++YS+VLGR TT+ E M KA++WLIR+YWDF+FP P 
Sbjct: 184 LKNWLLYLMNDMMFSHFMLSEWDEYYSKVLGRRTTICEIMGKAEMWLIRSYWDFEFPRPF 243 

Query: 271 LPNVEFVGGLHCKPAKPLPKEMEEFVQSSGENGVWFSLGSMVSNTSEERANVIASALAK 330 

LPN E+VGGLHCKPAKPLP+E+EEFVQSSG +GVWF+LGSM+ N + EER+N+ 1 AS ALA+ 
Sbjct: 244 LPNFEYVGGLHCKPAKPLPEELEEFVQSSGNDGVWFTLGSMIQNLTEERSNLIASALAQ 303 

Query: 331 IPQKVLWRFDGNKPDTLGLNTRLYKWIPQNDLLGHPKTKAFITHGGMNGIYEAIYHGVPM 390 

IPQKVLWR+ G KP TLG NTRL+ +WI PQNDLLGHPKT+AF ITHGG NG+YEAIYHGVPM 
Sbjct: 304 IPQKVLWRYTGKKPATLGPNTRLFEWIPQNDLLGHPKTRAFITHGGTNGLYEAIYHGVPM 363 

Query: 391 VGVPIFGDQLDNIAHMKAKGAAVEINFKTMTSEDLLRALRTVITDSSYKENAMRLSRIHH 450 

VG+P+FGDQ DNIA +KAKGAAV+++ + MT+ LL+AL+ VI + SYKENAM+LSRIHH 
Sbjct: 3 64 VGIPLFGDQPDNIARVKAKGAAVDVDLRIMTTSSLLKALKDVINNPSYKENAMKLSRIHH 423 

Query: 451 DQPVKPLDRAVFWIEFVMRHKGAKHLRSAAHDLTWFQHYSIDVIGFLLTCVATAIFLFTK 510 

DQP+KPLDRAVFWIEFVMRHKGA+HLR AAHDLTWFQ+YS+DV+ FLLTCVAT IFL K 
Sbjct: 424 DQPLKPLDRAVFWIEFVMRHKGARHLRVAAHDLTWFQYYSLDVWFLLTCVATIIFLAKK 483 



Query: 511 CFLFSCQKFNKTRKIEKRE 529 

C LF ++F KT KRE 
Sbjct: 484 CCLFFYRRFCKTGNKRKRE 502



>g2773068 UGT1A; UDP-glucuronosyltransferase; EC 2.4.1.17
Length = 533 

Score = 465 bits (1183), Expect = e-131 

Identities = 238/518 (45%), Positives = 330/518 (62%), Gaps = 7/518 (1%) 

Query: 11 LIQLSCYFSSGSCGKVLWPTEFSHWMNIKTILDELVQRGHEVTVLASSASISFDPNSPS 70 

L+ C GKVLV PT+ SHW+++K L EL RGHE+ V++ + + 

Sbjct: 15 LLLFLCVGPRAEGGKVLVLPTDGSHWLSMKKALQELHARGHEIVWSPEVNLHIKKEDFF 74 

Query: 71 TLKFEVYPVSLTKTEFEDIIKQLVKRWAELPKDTFWSYFSQVQEIMWTFNDILRKFCKDI 130 

TL+ Y +S T+ EF D L + K F F ++ E + T + I ++ CK++ 

Sbjct : 75 TLR--TYAISYTQEEFNDFF--LGHSYLVFEKGHFLKMFLKIMENLKTASFIFQRSCKEL 130 

Query: 131 VSNKKLMKKLQESRFDWLADAVFPFGELLAELLKIPFVYSLRFSPGYAIEKHSGGLLFP 190 

+ NK+L+ L S FDWL D V+P G +LA+ L +P V+ LR P ++ P 
Sbjct: 131 MHNKELIGHLNSSSFDWLTDPVYPCGAVLAKYLSLPAVFFLRSVP-CDLDFEGTQCPNP 189 

Query: 191 PSWPVVMSELSDQMTFIERVTCNMIWLYFEFWFQIFD^ 250 

SY+P +++ SD MTF++RVKNM+Y L ++ I + SE+L R ++ + 

Sbjct: 190 SSYIPRLLTMNSDHMTFLQRVKNMLYPLSLKYICHIA-FTPYASLASELLQREVSVVDVF 248 

Query: 251 AKADIWLIRNYWDFQFPHPLLPNVEFVGGLHCKPAKPLPKEMEEFVQSSGENGVVVFSLG 310 

+ A +WL R + +P P+ + PN+ F+GG++C KPL +E E +V +SGE+G+WFSLG 
Sbjct: 249 SSASMWLFRGDFVLDYPRPVMPNMVFIGGINCANRKPLSQEFEAYVNASGEHGIWFSLG 308 

Query: 311 SMVSNTSEERANVI ASALAKI PQKVLWRFDGNKPDTLGLNTRLYKWI PQNDLLGHPKTKA 370 

SMVS +E+A IA AL KIPQ VLWR+ G P L NT L KW+ PQNDLLGH PK +A 
Sbjct: 309 SWSAIPKEKAMEIADALGKIPQTVLWRYTGTPPPNLAKNTILVICWLPQNDLLGHPKAF^ 368 

Query: 371 FITHGGMNGIYEAIYHGVPWGVPIFGDQLDNIAHMKAKGAAVEINFKTMTSEDLLRALR 430 

FITH G +GIYE I +GVPMV +P+FGDQ+DN M+ +GA + +N MTSEDL L+ 
Sbjct: 369 FITHSGSHGIYEGICNGVPMVMLPLFGDQMDNAKRMETRGAGLTLNVLEMTSEDLANGLK 428 

Query: 431 TVITDSSYKENAMRLSRIHHDQPVKPLDRAVFWIEFVMRHKGAKHLRSAAHDLTWFQHYS 490 

VI D SYKEN MRLS +H D+P+ + PLD AVFW+EFVMRHKGA HLR AAHDLTW+Q++S 
Sbjct: 429 AVINDKSYKENIMRLSSLHKDRPIEPLDLAWWEFVMRHKGAPHLRPAAHDLTWYQYHS 488 
Query: 491 IDVIGFLLTCVATAIFLFTKCFLFSCQK-FNKTRKIEK 527

+DVIGFLL V +F+ KC F C+K F + + ++K 
Sbjct: 489 VDVIGFLLAIVLGIVFITYKCCAFGCRKCFGRKGRVKK 526



>g2842546 UGT1A1; UDP-glucuronosyltransferase
Length = 533 

Score = 437 bits (1111), Expect = e-123 

Identities = 237/521 (45%), Positives = 322/521 (61%), Gaps = 9/521 (1%) 

Query: 8 ALLLIQLSCYFSSGSCGWLWPTEFSHWMNIKTILDELVQRGHEVTVLASSASISFDPN 67 

+LLL L+ S G GK+L+ P + SHW+++ + + L QRGH+V V+A AS+ 
Sbjct : 14 SLLLCALNPLLSQG--GKLLLVPMDGSHWLSLFGVIQRLHQRGHDWWAPEASVYIKEG 71 

Query : 68 SPSTLKFEVYPVSLTKTEFEDIIKQLVKRWAELPKDTFWSYFSQVQEIMWTFNDILRKFC 127 

+ TLK YPV + + E L EKF + + + +L C 

Sbjct: 72 AF YTLKS - - Y PVPFRRVDVE AS FTGLGLG I FE - - KKPFLRRWAT YKRVKKDS ALLLS AC 127 

Query: 128 KDIVSNKKLMKKLQESRFDWLADAVF PFGELLAELLKI PFVYSLRFS PGYAI EKHSGGL 187 

+ + N++LM L ES FD +L D P G + +A L P V+ L P + + 
Sbjct: 128 SHLLYNEELMASLAESGFDAMLTDPFLPCGPIVALRLAWPWFFLNSLP-CGLDFQGTRC 186 

Query: 188 LFPPSYVPWMSELSDQMTFIERVKNMIYVLYFEFWFQIFDMKKWDQFYSEVLGRPTTLS 247 

PPSYVP V+S SD MTF++RVKNM+ +L E + + SEVL + T+ 

Sbjct: 187 PSPPSWPRVLSLNSDHMTFLQRVKNML-ILGSEGFLCNVVYSPYASLASEVLQKDVTVQ 245 

Query: 248 ETMAKADIWLIRNYWDFQFPHPLLPNVEFVGGLHCKPAKPLPKEMEEFVQSSGENGVWF 307 

+ M A +WL R+ + + P++PN+ F+GG++C PL +E E +V +SGE+G+WF 

Sbjct: 246 DLMGSASWLFRSDFWDYSRPIMPN1WFIGGINCAGKNPLSQEFEAYVNASGEHGIVVF 305 

Query: 308 SLGSMVSNTSEERANVI ASALAKI PQKVLWRFDGNKPDTLGLNTRLYKWI PQNDLLGHPK 367 

SLGSMVS +E+A I A AL KIPQ VLWR+ GPL NT L KW+ PQNDLLGHPK 
Sbjct: 306 SLGSMVSE I PKEKAME I ADALGKIPQTVLWRYTGTPPPNLAKNTILVKWL PQNDLLGHPK 365 

Query: 368 TKAF ITHGGMNGI YEAI YHGVPMVGVPI FGDQLDNI AHMKAKGAAVE INFKTMTS EDLLR 427 

+AFITH G +GIYE I +GVPMV +P+FGDQ+DN M+ +GA + +N MTSEDL 
Sbjct: 366 ARAF ITHSGS HG I YEGI CNGVPMVML PLFGDQMDNAKRMETRGAGLTLNVLEMTS EDLAN 425 

Query: 428 ALRTVITDSSYKENAMRLSRIHHDQPVKPLDRAVFWIEFVMRHKGAKHLRSAAHDLTWFQ 487 

AL+ VI D SYKEN MRLS +H D+P+ + PLD AVFW+EFVMRHKGA HLR AAHDLTW+Q 
Sbjct: 426 ALKAVINDKSYKENIMRLSSLHKDRPIEPLDLAVFWVEFVMRHKGAPHLRPAAHDLTWYQ 485 

Query: 488 HYSIDVIGFLLTCVATAIFLFTKCFLFSCQK-FNKTRKIEK 527 

++S+DVIGFLL V +F+ KC F C+K F K +++K 
Sbjct: 486 YHSVDVIGFLLAIVLGIVFITYKCCAFGCRKCFGKKGRVKK 526



>g2773066 UGT1A; UDP-glucuronosyltransferase; EC 2.4.1.17
Length = 533 

Score = 435 bits (1107), Expect = e-122 

Identities = 235/521 (45%), Positives = 322/521 (61%), Gaps = 9/521 (1%) 

Query: 8 ALLLIQLSCYFSSGSCGKVLWPTEFSHWMNIKTILDELVQRGHEVTVLASSASISFDPN 67 

+LLL L+ S G GK+L+ P + SHW+++ ++ L QRGH+V V+A AS+ 
Sbjct : 14 SLLLCALNPLLSQG--GKLLLVPMDGSHWLSLFGVIQRLHQRGHDWWAPEASVYIKEG 71 

Query : 68 SPSTLKFEVYPVSLTKTEFEDIIKQLVKRWAELPKDTFWSYFSQVQEIMWTFNDILRKFC 127 

+ TLK YPV + + E L EKF ++++LC 

Sbjct : 72 AFYTLKS- -YPVPFRREDVEASFTGLGLGVFE- -KKPFLQRWAT YKRVKKDS ALLLS AC 127 

Query: 128 KDIVSNKKLMKKLQESRFDWLADAVF PFGELLAELLKI PFVYSLRFS PGYAI EKHSGGL 187 

++ N++LM L ES FD +L D P G ++A L +P V+ L P ++ 
Sbjct: 128 SHLLYNEELMASLAESGFDAMLTDPFLPCGPIVALRLALPWFFLNSLP-CGLDFQGTRC 186 
Query: 188 LFPPSYVPVVMSELSDQMTFIERVKNMIYVLYFEFWFQIFDMKKWDQFYSEVLGRPTTLS 247

PPSYVP V+S SD MTF++RVKNM+ +L E + + SEVL + T+ 

Sbjct: 187 PSPPSWPRVLSLNSDHMTFLQRVKNML-ILGSEGFLC1WWSPYASLASEVLQKDVTVQ 245 

Query: 248 ETMAKADIWLIRNYWDFQFPHPLLPNVEFVGGLHCKPAKPLPKEMEEFVQSSGENGVVVF 307 

+ M A +WL R+ + + P++PN+ F+GG++C PL +E E +V +SGE+G+WF 
Sbjct: 246 DLMGSASVWLFRSDFVKDYSRPIMPNMVFIGGINCAGKNPLSQEFEAYVNASGEHGIWF 305 

Query: 3 08 SLGSMVSNTSEERANVIASALAKI PQKVLWRFDGNKPDTLGLNTRLYKWI PQNDLLGHPK 367 

SLGSMVS +E+A IA AL KIPQ VLWR+ GPL NT L KW+ PQNDLLGHPK 
Sbjct: 306 SLGSMVSAI PKEKAMEIADALGKI PQTVLWRYTGTPPPNLAKNTILVKWL PQNDLLGHPK 365 

Query: 368 TKAFITHGGMNGIYEAIYHGVPMVGVPIFGDQLDNIAHMKAKGAAVEINFKTMTSEDLLR 427 

+AFITH G +GIYE I +GVPMV +P+FGDQ+DN M+ +GA + +N MTSEDL 
Sbjct: 366 ARAFITHSGSHGIYEGICNGVPMVMLPLFGDQMDNAKRMETRGAGLTLNVLEMTSEDLAN 425 

Query: 428 ALRTVITDSSYKENAMRLSRIHHDQPVKPLDRAVFWIEFVMRHKGAKHLRSAAHDLTWFQ 487 

L+ VI D SYKEN MRLS +H D+P++PLD AVFW+EFVMRHKGA HLR AAHDLTW+Q 
Sbjct: 426 GLKAVINDKSYKENIMRLSSLHKDRPIEPLDLAVFWVEFVMRHKGAPHLRPAAHDLTWYQ 485 

Query: 488 HYSIDVIGFLLTCVATAIFLFTKCFLFSCQK-FNKTRKIEK 527 

++S+DVIGFLL V +F + KC F C+K F + + ++K 
Sbjct: 486 YHSVDVIGFLLAIVLGIVFITYKCCAFGCRKCFGRKGRVKK 526



>g483789 UGT1-4; UDP-glucuronosyltransferase
Length = 532 

Score = 428 bits (1089), Expect = e-120 

Identities = 227/518 (43%), Positives = 329/518 (62%), Gaps = 8/518 (1%) 

Query: 11 LIQLSCYFSSGSCGKVI.WPTEFSHWMNIKTILDELVQRGHEVTVLASSASISFDPNSPS 70 

L+ L C GKVLV P + S W++++ ++ ++ RGH+V VL + + 

Sbjct: 15 LLLLLCVLPWAEGGKVLWPMDGSPWLSLREWRDVHARGHQVLVLGPEVTMHIKGEDFF 74 

Query: 71 TLKFEVYPVSLTKTEFEDIIKQLVKRWAELPKDTFWSYFSQVQEIMWTFNDILRKFCKDI 130 

TL + Y +K EF+ ++++ + + ■ P+ + + ++ + F+ + + C ++ 

Sbjct: 75 TL--QTYATPYSKEEFDQLMQRNYQMIFK-PQHSLKTLLETMENLK-KFSMLCSRSCWEL 130 

Query: 131 VSNKKLMKKLQESRFDWLADAVFPFGELLAELLKIPFVYSLRFSPGYAIEKHSGGLLFP 190 

+ NK L+K L ES FDWL D + G LLA+ L +P V+ LRF ++ P 

Sbjct: 131 LHNKPLIKHLNESSFDWLTDPLDLCGALLAKYLSVPSVFLLRFIL-CDLDFEGTQCPNP 189 

Query: 191 PSWPVVMSELSDQMTFIERVKNMIWLYFEFWFQIFDMKKWDQFYSEVLGRPTTLSETM 250 

SY+P + + + SD M+F++RVKNM+Y L ++ I + SE+ R +L + + 

Sbjct: 190 SSYIPRMLTMNSDHMSFLQRVKNMLYPLMMKYTCHI-SYDPYASLASELFQREVSLVDIL 248 

Query: 251 AKADIWLIRNYWDFQFPHPLLPNVEFVGGLHCKPAKPLPKEMEEFVQSSGENGWVFSLG 310 

+ A +WL R + +P P+ + PN+ F+GG++C KPL +E E +V +SGE+G+WFSLG 
Sbjct: 249 SHASVWLFREDFVLDYPRPIMPNMVFIGGINCANRKPLSQEFEAYVNASGEHGIWFSLG 308 

Query : 311 SMVSNTSEERANVI ASALAK I PQKVLWRFDGNKPDTLGLNTRLYKWI PQNDLLGH PKTKA 37 0 

SMVS E++A IA AL KIPQ VLWR+ G+ + P L NT L KW+PQN LLGHPKT+A 
Sbjct: 309 SMVSEI PEKKAMEIADALGKI PQTVLWRYTGSRPSNLAKNTYLVKWLPQNVLLGHPKTRA 368 

Query: 371 FITHGGMNGIYEAIYHGVPMVGVPIFGDQLDNIAHMKAKGAAVEINFKTMTSEDLLRALR 430 

FITH G +GIYE I +GVPMV +P+FGDQ+DN ++ +GA V +N MTS+DL AL+ 
Sbjct: 3 69 FITHSGSHGIYEGICNGVPMVMLPLFGDQMDNAKRIETRGAGVTLNVLEMTSDDLANALK 428 

Query: 431 TVITDSSYKENAMRLSRIHHDQPVKPLDRAVFWIEFVMRHKGAKHLRSAAHDLTWFQHYS 490 

TVI D SYKEN MRLS +H D+PV+PLD AVFW+ EF VMRHKGA R AAHDLTW+Q+ + S 
Sbjct: 429 TVINDKS YKENIMRLS S LHKDRPVE PLDL AVFWVEF VMRHKG AAP - RPAAHDLTWYQ YHS 487 

Query: 491 IDVIGFLLTCVATAIFLFTKCFLFSCQK-FNKTRKIEK 527 

+DVIGFLL V T F+ KC F+ K F K +++K 
Sbjct: 488 LDVIGFLLAIVLTVAFVTFKCCAFAWGKCFGKKGRVKK 525 
>g2605508 UDP-glucuronosyltransferase
Length = 529 

Score = 428 bits (1089), Expect = e-120 

Identities = 228/509 (44%), Positives = 314/509 (60%), Gaps = 17/509 (3%) 

Query: 25 KVLWPTEFSHWMNIKTILDELVQRGHEVTVLASSASISFDPNSPSTLKFEVYPVSLTKT 84 

++LV P + SHW+++K I++ L ++GHE+ V+ + + + T K + + PV + 
Sbjct: 25 RLLVVPQDGSHWLSMKDIVEHLSEKGHEIVVWPEVNLLLQESKHYTRK- - IHPVPFNQE 82 

Query: 85 EFEDIIKQLVKRWAELPKDTF WSYFSQVQEIMWTFNDILRKF--CKDIVSNKKLMKK 139 

EE R+ KFW+VE I F C+ ++ + ++ 

Sbjct: 83 ELE ARYRSFGKHHFSPRWLVTAPWEYRJSnSMIVINMYFLNCQSLLRHSDTLRF 135 

Query: 140 LQESRFDWLADAVFPFGELLAELLKIPFVYSLRFSPGYAIEKHSGGLLFPPSYVPWMS 199 

L+E++FD + D P G +LAE L +P VY R P A+E P SYVP + 

Sbjct: 136 LRENKFDALFTDPALPCGVILAEYLNLPSVYLFRGFP-CALENTFTRTPSPLSYVPRYYT 194 

Query: 2 00 ELSDQMTFIERVKNMIYVTjYFEFWFQIFDMKKWDQFYSEVLGRPTTLSETMAKADIWLIR 259 

+ SD MTF++RV N + V Y E K++ EVLGR L KA IWL+R 

Sbjct: 195 QFSDHMTFLQRVGNFL-VNYLENILLYALYSKYEDLAGEVLGRQVHLPALYRKASIWLLR 253 

Query: 260 NYWDFQFPHPLLPNVEFVGGLHCKPAKPLPKEMEEFVQSSGENGVWFSLGSMVSNTSEE 319 

+ F+ + P P+ + PN +GG CK L +E E +V +SGE+G+WFSLGSMVS E+ 
Sbjct: 254 YDFVFEYPRPVTytPNTVl.IGGSSCKKQGVLSQEFEAYVNASGEHGIVVFSLGSMVSEIPEQ 313 

Query: 320 RAIWIASALAKIPQKVLWRFDGJNTKPDTLGLNTRLYKW^ 379 

+A IA AL KIPQ VLWR+ GPL NT+L KW+PQNDLLGHPKT+AFITH G +G 
Sbjct: 314 KAME I ADALGKI PQTVLWRYTGTP PPNLAKNTKLVKWL PQNDLLGH PKTRAF ITHSGS HG 373 

Query: 380 IYEAIYHGVPMVGVPIFGDQLDNIAHMKAKGAAVEINFKTMTSEDLLRALRTVITDSSYK 439 

IYE I +GVPMV +P+FGDQ+DN M+ +GA V +N M+SEDL +AL+ VI + +YK 
Sbjct: 374 IYEGICNGVPMV^MPLFGDQMDNAKRMETRGAGVTLNVLEMSSEDLEKALKAVINEKTYK 433 

Query: 440 ENAMRLSRIHHDQPVKPLDRAVFWIEFVMRHKGAKHLRSAAHDLTWFQHYSIDVIGFLLT 499 

EN MRLSR+H D+P+ + PLD AVFW+ EFVMRHKG A HLR AAHDLTW+Q+ + S+DVIGFLL 
Sbjct: 434 ENIMRLSRLHKDRPIEPLDLAVFWVEFVMRHKGASHLRPAAHDLTWYQYHSLDVIGFLLA 493 

Query: 500 CVATAIFLFTKCFLFSCQK-FNKTRKIEK 527 

T IF+ K F+ +K F K +++K 
Sbjct: 494 VTLTVIFITFKACAFAFRKCFGKKERVKK 522



>g4760841 sheUGT1A6; UDP-glucuronosyltransferase
Length = 531

Score = 421 bits (1072), Expect = e-118
Identities = 221/504 (43%), Positives = 312/504 (61%), Gaps = 7/504 (1%)

Query: 25  KVLVWPTEFSHWMNIKTILDELVQRGHEVTVLASSASISFDPNSPSTLKFEVYPVSLTKT 84
           ++LV P + SHW+++K I + L ++GHE+ V+ ++ + T + ++PV +
Sbjct: 27  RLLVVPQDGSHWLSMKDITERLSEKGHEIVVWPKVNLLLQESKHYTRR--IHPVPYDQE 84

Query: 85  EFEDIIKQLVKRWAELPKDTFWSYFSQVQEIMWTFNDILRKFCKDIVSNKKLMKKLQESR 144
           E E + K P+ + + + M N C+ ++ + ++ L+ES +
Sbjct: 85  ELEARYRSFGKHHFS-PRWLWAPMVFlYRNNMIVIlSn^ 142

Query: 145 FDVVLADAVFPFGELLAELLKIPFVYSLRFSPGYAIEKHSGGLLFPPSYVPVVMSELSDQ 204
           FD + D P G +LAE L +P VY R P A+E P SYVP ++ SD+
Sbjct: 143 FDALFTDPALPCGVILAEYLNLPSVYLFRGFP-CALENTFTRTPSPLSYVPRYYTQFSDK 201

Query: 205 MTFIERVKNMIYVLYFEFWFQIFDMKKWDQFYSEVLGRPTTLSETMAKADIWLIRNYWDF 264
           MTF++RV N + V Y E K++ EVLGR L KA IWL+R + F
Sbjct: 202 MTFLQRVANFL-VSYLENILLYALYSKYEDLAEEVLGRQVHLPALYQKASIWLLRYDFVF 260

Query: 265 QFPHPLLPNVEFVGGLHCKPAKPLPKEMEEFVQSSGENGVVVFSLGSMVSNTSEERANVI 324
           + + P P+ + PN+ F+GG K LP+E E +V +SGE+G+V+FSLGSMVS E++A I
Sbjct: 261 EYPRPVMPNMVFIGGSAAKKQGILPREFEAYVNASGEHGIVIFSLGSMVSEIPEQKAMEI 320

Query: 325 ASALAKI PQKVLWRFDGNKPJJTLGLOTR^ 384 

A AL KIPQ VLWR+ GPL NT+L KW+PQNDLLG PKT+AFITH G +G+YE I 
Sbjct: 321 ADALGKIPQTVLWRYTGTPPPNLAKOTKLVKWLPQNDLLGQPKTRAFITHSG 380 

Query: 385 YHGVPMVGVP I FGDQLDNI AHMKAKGAAVE INFKTMTS EDLLRALRTVITDS S YKENAMR 444 

+GVPMV +P+FGDQ+DN M+ +GA + +N M+S DL AL+ VI + SYKEN MR 
Sbjct: 381 CNGVPMVMMPLFGDQMDNAERMETRGAGITLNVLEMSSGDLENALKAVINEKSYKENIMR 440 

Query: 445 LSRIHHDQPVKPLDRAVFWIEFVMRHKGAKHLRSAAHDLTWFQHYSIDVIGFLLTCVATA 504 

LSR+H D+P++PLD AVFW+ EFVMRHKGA HLR AAHDLTW+Q++S+DVIGFLL T 
Sbjct: 441 LSRLHKDRPIEPLDLAVFWVEFVMRHKGASHLRPAAHDLTWYQYHSLDVIGFLLAVTLTV 500 

Query: 505 IFLFTKCFLFSCQK-FNKTRKIEK 527 

IF+ K F+ +K F K +++K 
Sbjct: 501 IFITFKACAFTFRKCFGKKERVKK 524 

Database: gb137mamp

Posted date: Sep 11, 2003 11:25 AM 
Number of letters in database: 6,697,212 
Number of sequences in database: 27,779 

Lambda K H 

0.324 0.138 0.428 

Gapped 

Lambda K H 

0.270 0.0470 0.230 



Matrix: BLOSUM62 

Gap Penalties: Existence: 11, Extension: 1 
Number of Hits to DB: 7447897 
Number of Sequences: 27779 
Number of extensions: 323061 
Number of successful extensions: 685 
Number of sequences better than 10.0: 33 
Number of HSP's better than 10.0 without gapping: 18 
Number of HSP's successfully gapped in prelim test: 15 
Number of HSP's that attempted gapping in prelim test: 623 
Number of HSP's gapped (non-prelim): 35 
length of query: 529 
length of database: 6,697,212 
effective HSP length: 46 
effective length of query: 483 
effective length of database: 5,419,378 
effective search space: 2617559574 
effective search space used: 2617559574 
T: 11
A: 40
X1: 15 (7.0 bits)
X2: 38 (14.8 bits)
X3: 64 (24.9 bits)
S1: 41 (22.0 bits)
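
The statistics above can be cross-checked with a few lines of arithmetic. The sketch below assumes the standard Karlin-Altschul relations, namely a bit score S' = (lambda * S - ln K) / ln 2 and an expectation E = (effective search space) * 2^(-S'), and applies them to the gapped Lambda and K values and the lengths printed in this report. The function names are illustrative, and because the report prints rounded parameters the computed bit scores only approximate the printed ones.

```python
import math


def effective_search_space(query_len: int, db_letters: int,
                           db_sequences: int, hsp_len: int) -> int:
    """Effective query length times effective database length."""
    eff_query = query_len - hsp_len
    eff_db = db_letters - db_sequences * hsp_len
    return eff_query * eff_db


def bit_score(raw_score: float, lam: float, k: float) -> float:
    """Convert a raw alignment score to bits: (lambda*S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def log10_evalue(bits: float, search_space: int) -> float:
    """log10 of E = search_space * 2**(-bits), kept in log form to avoid underflow."""
    return math.log10(search_space) - bits * math.log10(2)


# Values taken from the report: 529-letter query, 6,697,212 letters in
# 27,779 sequences, effective HSP length 46, gapped Lambda 0.270 and K 0.0470.
space = effective_search_space(529, 6_697_212, 27_779, 46)
print(space)  # 2617559574, matching "effective search space"
print(bit_score(1990, 0.270, 0.0470))   # close to the reported 779 bits
print(log10_evalue(421, space))          # close to -117.3, in line with the reported e-118
```

The effective lengths reproduce the report exactly (483 for the query and 5,419,378 for the database), which suggests the length correction is simply the effective HSP length subtracted once from the query and once per database sequence.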



NCBI-BLASTP 2.0.10 [Aug-26-1999]

Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.

Query= 2912330CD1

(529 letters)

Database: gb137rodp

74,095 sequences; 25,169,402 total letters

Searching done

                                                                   Score     E
Sequences producing significant alignments:                        (bits)  Value

g20071113 RIKEN cDNA 1300012D20                                      782   0.0
g20380046 expressed sequence AI788959                                779   0.0
g207581 UDP-glucuronosyltransferase (EC 2.4.1.17)                    772   0.0
g207569 UDP glucuronosyltransferase-2                                772   0.0
g18146841 UGT2B21; UDP-glucuronosyltransferase 2B21; EC 2.7.1        771   0.0
g458395 UDP-glucuronosyltransferase; EC 2.4.1.17                     755   0.0
g458397 UDP-glucuronosyltransferase; EC 2.4.1.17                     747   0.0
g20381430 UDP-glucuronosyltransferase 2 family, member 5             747   0.0
g55120 unnamed protein product; UDP-glucuronosyltransferase p        745   0.0
g15929692 RIKEN cDNA 9430041C03                                      742   0.0

> g20071113 RIKEN cDNA 1300012D20 
Length = 529 

Score = 782 bits (1998), Expect = 0.0 

Identities = 369/528 (69%), Positives = 431/528 (80%), Gaps = 1/528 (0%) 

Query: 1   MSMKWTSALLLIQLSCYFSSGSCGKVLVWPTEFSHWMNIKTILDELVQRGHEVTVLASSA 60
           MSMK S LLIQ CY G+CGKVLVWPTE+SHW+N+K I LDELVQRGH+VTVL SSA
Sbjct: 1

Query: 61
           SI P+ + S++ FE+Y L+K + E + + V W EL K FW+ +S++Q+I
Sbjct: 61

Query: 120
           +D+ + FCK +V NK LMKKLQ S+FDWLADA+ P GELL+ELLK P VYSLRF PGY
Sbjct: 121

Query: 180
           EK+SGGL PPSYVPW+SELSD MTF ERVKNM+ VL F+FWFQ F+ K W+QFYS+V
Sbjct: 181

Query: 240
           LGRPTTL+E M KADIWL+R +WD +FPHP LPN +FVGGLHCKPAKPLPKEMEEFVQSS
Sbjct: 241

Query: 300
           GE+GVWFSLGSMV N EE+ANV+ASALA+IPQKVLWRFDG KPDTLG NTRLYKWIPQ
Sbjct: 301

Query: 360
           NDLLGH PKTKAF I HGG NGI YEAIYHG+ P+VG+ P'+FGDQ DNI H+ AKGAAV ++F T
Sbjct: 361

Query: 420
           M++ DLL AL+TVI D SYKENAMRLSRIHHDQP+KPLDRAVFWIE+VMR+KGAKHLR A
Sbjct: 421

Query: 480 AHDLTWFQHYSIDVIGFLLTCVATAIFLFTKCFLFSCQKFNKTRKIEK 527

HDLTWFQ++S+DVIGFLL CV +F+ KC LF C K K +K 

Sbjct: 481 LHDLTWFQYHSLDVIGFLLVCWAWFIIAKCCLFCCHKTANMGKKKK 528 



> g20380046 expressed sequence AI788959 
Length = 532 

Score = 779 bits (1990), Expect = 0.0

Identities = 365/532 (68%), Positives = 443/532 (82%), Gaps = 3/532 (0%) 

Query: 1 MSMKWTSALLLI--QLSCYFSSGSCGKVLWPTEFSHWMNIKTILDELVQRGHEVTVLAS 58 

M +K T+ALLL+ QLS +F SG+ GKVLVWP EFSHW+N+KTILDEL+++GHEV VL 
Sbjct: 1 MPVXMTAALLLLLLQLSGFFGSGTGGKVLWPMEFSHWLNLKTILDELLKKGHEVWLRP 60 

Query: 59 SASISFDPNSPSTLKFEVYPVSLTKTEFEDIIKQLVKRWA-ELPKDTFWSYFSQVQEIMW 117 

SAS+S++ ++ S ++FE YP S + +E E+I + +K+ + ELPK +FW YF +QE++W 
Sbjct : 61 SASLSYEVDNTSAIEFETYPTSYSLSELEEIFWESLKKYIYELPKQSFWGYFLMLQEMVW 120 

Query: 118 TFNDILRKFCKDIVSNKKLMKKLQESRFDWLADAVFPFGELLAELLKIPFVYSLRFSPG 177 

+ CKD+V NK+LM KLQ+SRFDV+LAD P G+LLAE+LKIP VYSLRF PG 

Sbjct: 121 VX>SKYFESLCKDWFNKELMTKLQKSRFDVILADPFIPCGDLLAEVLKIPLVYSLRFFPG 180 

Query: 178 YAIEKHSGGLLFPPSWPVVMSELSDQMTFIERVKNMIYVLYFEFWFQIFDMKKWDQFYS 237 

EK+SGGL PPSYVPWMSELSD+MTF+ERV+N+IY+L F+FWFQ F+ K W+Q Y+ 
Sbjct: 181 STYEKYSGGLPLPPSWPVVMSELSDRMTFMERVRIWIYMLCFDFWFQTFNEKNWNQLYT 240 

Query: 238 EVLGRPTTLSETMAKADIWLIRNYWDFQFPHPLLPNVEFVGGLHCKPAKPLPKEMEEFVQ 297 

EVLGRPTTLSETMAKADIWLIR YWD +FPHP+LPN +F+GGLHC+PAKPLPKE+E+FVQ 
Sbjct: 241 EVLGRPTTLSETMAKADIWLIRTYWDLEFPHPVLPNFDFIGGLHCRPAKPLPKEIEDFVQ 300 

Query: 298 SSGENGVWFSLGSWSNTSEERANVIASALAKIPQKVLWRFDGNKPDTLGLNTRLYKWI 357 

SSGE+GVWFSLGSMV + +EERANVIA+ LA+ 1 PQKVLWRF +G KP+TLG NTRLYKWI 
Sbjct: 301 SSGEHGVWFSLGSMVGSITEERANVIAAGLAQIPQKVLWRFEGKKPETLGSNTRLYKWI 360 

Query: 358 PQNDLLGHPKTKAFITHGGMNGIYEAIYHGVPWGVPIFGDQLDNIAHMKAKGAAVEINF 417 

PQNDLLGH KT+AFITHGG NGIYEAIYHG+P+VG+P+FGDQ DNI H+KAKGAAV ++F 
Sbjct: 361 PQNDLLGHSKTRAF ITHGGTNG I YE AI YHG I PWG I PLFGDQ YDNI VHLKAKG AAVRLDF 420 

Query: 418 KTMTSEDLLRALRTVITDSSYKENAMRLSRIHHDQPVKPLDRAVFWIEFVMRHKGAKHLR 477 

TM+S DL AL+TV D SYKENAMRLSRIHHDQPVKPLDRAVFWIEFVMRHKGAKHLR 
Sbjct: 421 LTMSSTDLHTALKTVTISnDPSYKENAMRLSRIHHDQPVKPLDRAWWIEFVMRHKGAKHLR 480 

Query: 478 S AAHDLTWFQHYS I DVIGF LLTC VATAI FLFTKC FLF S CQKFNKTRK I EKRE 529 

AAHDL+W Q++S+DV+GFLL CV T +F+ KC LF CQK K + +K E 
Sbjct: 481 VAAHDLSWVQYHSLDVLGFLLACVLTVMFILKKCCLFCCQKLTKAGRKKKGE 532



>g207581 UDP-glucuronosyltransferase (EC 2.4.1.17)
Length = 529

Score = 772 bits (1972), Expect = 0.0 

Identities = 361/528 (68%), Positives = 429/528 (80%), Gaps = 1/528 (0%) 

Query: 1 MSMKWTSALLLIQLSCYFSSGSCGKVLWPTEFSHWMNIKTILDELVQRGHEVTVLASSA 60 

MSMK TS LLIQL CYF G+CGKVLVWPTE+SHW+NIK IL+EL QRGHEVTVL SSA 
Sbjct : 1 MSMKQTSVFLLIQLICYFRPGACGKVLVWPTEYSHWINIKIILNELAQRGHEVTVLVSSA 60 

Query: 61 S I SFDPNS PSTLKFEVYPVSLTKTEFEDI I KQLVKRWA - ELPKDTFWS YFSQVQE IMWTF 119 

SI +P S++ FE+Y V L+K++ E + + W + + W+Y+S++Q++ + 
Sbjct: 61 SILIEPTKESSINFEIYSVPLSKSDLEYSFAKWIDEWTRDFETLSIWTYYSKMQKVFNEY 120 

Query: 120 NDILRKFCKDIVSNKKLMKKLQESRFDWLADAVFPFGELLAELLKIPFVYSLRFSPGYA 179 

+D++ CK ++ NK LMKKLQ S+FDV+LADAV P GELLAELLK P VYSLRF PGY 
Sbjct: 121 SDWENLCKALIWNKSLMKKLQGSQFDVILADAVGPCGELLAELLKTPLVYSLRFCPGYR 180 

Query: 180 IEKHSGGLLFPPSWPVWSELSDQMTFIERVKNMIWLYFEFWFQIFDMKKWDQFYSEV 239 
EK SGGL PPSYVPW+SELSD+MTF+ERVKNM+ +LYF+FWFQ F K W QFYS+V
Sbjct: 181 CEKFSGGLPLPPSWPVVLSELSDRMTFVERVKNMLQMLYFDFWFQPFKEKSWSQFYSDV 240 

Query: 240 LGRPTTLSETMAKADIWLIRNYWDFQFPHPLLPNVEFVGGLHCKPAKPLPKEMEEFVQSS 299 

LGRPTTL+E M KADIWLIR +WD +FPHP LPN +FVGGLHCKPAKPLP+EMEEFVQSS 
Sbjct: 241 LGRPTTLTEMMGKADIWLIRTFWDLEFPHPFLPNFDFVGGLHCKPAKPLPREMEEFVQSS 300 

Query: 300 GENGVWFSLGSWSNTSEERANVIASALAKIPQKVLWRFDGNKPDTLGLNTRLYKWIPQ 359 

GE+GVWFSLGSMV N + EE + ANV+ AS ALA+ 1 PQKV+ WRFDG KPDTLG NTRLYKWIPQ 
Sbjct: 301 GEHGVWFSLGSMVKNLTEEKANWASALAQI PQKWWRFDGKKPDTLGSNTRLYKWI PQ 360 

Query: 360 NDLLGHPKTKAF ITHGGMNG I YEAI YHGVPMVGVP I FGDQLDNI AHMKAKGAAVE INFKT 419 

NDLLGHPKTKAF+ HGG NGIYEAIYHG+P+VG+P+F DQ DNI HM AKGAAV ++F 
Sbjct: 361 NDLLGHPKTKAFVAHGGTNGIYEAIYHGI PIVGI PLFADQPDNINHMVAKGAAVRVDFSI 420 

Query: 420 MTSEDLLRALRTVITDSS YKENAMRLS RIHHDQPVKPLDRAVFWIEFVMRHKGAKHLRS A 479 

+ + + LL AL+ V+ D S YKENAMRLS RI HHDQPVKPLDRAVFWI E + VMRHKGAKHLRS 
Sbjct: 421 LSTTGLLTALKIVMNDPSYKENAMRLSRI HHDQPVKPLDRAVFWI EYVMRHKGAKHLRST 480 

Query: 480 AHDLTWFQHYSIDVIGFLLTCVATAIFLFTKCFLFSCQKFNKTRKIEK 527 

HDL+WFQ++S+DVIGFLL CV +F+ TK LF C+K K +K 

Sbjct: 481 LHDLSWFQYHSLDVIGFLLLCWGWFIITKFCLFCCRKTANMGKKKK 528 



> g207569 UDP glucuronosyltransferase-2
Length = 529 

Score = 772 bits (1972), Expect = 0.0 

Identities = 361/528 (68%), Positives = 429/528 (80%), Gaps = 1/528 (0%) 

Query: 1   MSMKWTSALLLIQLSCYFSSGSCGKVLVWPTEFSHWMNIKTILDELVQRGHEVTVLASSA 60
           MSMK TS  LLIQL CYF  G+CGKVLVWPTE+SHW+NIK IL+EL QRGHEVTVL SSA
Sbjct: 1   MSMKQTSVFLLIQLICYFRPGACGKVLVWPTEYSHWINIKIILNELAQRGHEVTVLVSSA 60

Query: 61  SISFDPNSPSTLKFEVYPVSLTKTEFEDIIKQLVKRWA-ELPKDTFWSYFSQVQEIMWTF 119
           SI  +P   S++ FE+Y V L+K++ E    + +  W  +    + W+Y+S++Q++   +
Sbjct: 61  SILIEPTKESSINFEIYSVPLSKSDLEYSFAKWIDEWTRDFETLSIWTYYSKMQKVFNEY 120

Query: 120 NDILRKFCKDIVSNKKLMKKLQESRFDVVLADAVFPFGELLAELLKIPFVYSLRFSPGYA 179
           +D++   CK ++ NK LMKKLQ S+FDV+LADAV P GELLAELLK P VYSLRF PGY
Sbjct: 121 SDVVENLCKALIWNKSLMKKLQGSQFDVILADAVGPCGELLAELLKTPLVYSLRFCPGYR 180

Query: 180 IEKHSGGLLFPPSYVPVVMSELSDQMTFIERVKNMIYVLYFEFWFQIFDMKKWDQFYSEV 239
            EK SGGL  PPSYVPVV+SELSD+MTF+ERVKNM+ +LYF+FWFQ F  K W QFYS+V
Sbjct: 181 CEKFSGGLPLPPSYVPVVLSELSDRMTFVERVKNMLQMLYFDFWFQPFKEKSWSQFYSDV 240

Query: 240 LGRPTTLSETMAKADIWLIRNYWDFQFPHPLLPNVEFVGGLHCKPAKPLPKEMEEFVQSS 299
           LGRPTTL+E M KADIWLIR +WD +FPHP LPN +FVGGLHCKPAKPLP+EMEEFVQSS
Sbjct: 241 LGRPTTLTEMMGKADIWLIRTFWDLEFPHPFLPNFDFVGGLHCKPAKPLPREMEEFVQSS 300

Query: 300 GENGVVVFSLGSMVSNTSEERANVIASALAKIPQKVLWRFDGNKPDTLGLNTRLYKWIPQ 359
           GE+GVVVFSLGSMV N +EE+ANV+ASALA+IPQKV+WRFDG KPDTLG NTRLYKWIPQ
Sbjct: 301 GEHGVVVFSLGSMVKNLTEEKANVVASALAQIPQKVVWRFDGKKPDTLGSNTRLYKWIPQ 360

Query: 360 NDLLGHPKTKAFITHGGMNGIYEAIYHGVPMVGVPIFGDQLDNIAHMKAKGAAVEINFKT 419
           NDLLGHPKTKAF+ HGG NGIYEAIYHG+P+VG+P+F DQ DNI HM AKGAAV ++F
Sbjct: 361 NDLLGHPKTKAFVAHGGTNGIYEAIYHGIPIVGIPLFADQPDNINHMVAKGAAVRVDFSI 420

Query: 420 MTSEDLLRALRTVITDSSYKENAMRLSRIHHDQPVKPLDRAVFWIEFVMRHKGAKHLRSA 479
           +++  LL AL+ V+ D SYKENAMRLSRIHHDQPVKPLDRAVFWIE+VMRHKGAKHLRS
Sbjct: 421 LSTTGLLTALKIVMNDPSYKENAMRLSRIHHDQPVKPLDRAVFWIEYVMRHKGAKHLRST 480

Query: 480 AHDLTWFQHYSIDVIGFLLTCVATAIFLFTKCFLFSCQKFNKTRKIEK 527
            HDL+WFQ++S+DVIGFLL CV   +F+ TK  LF C+K     K +K
Sbjct: 481 LHDLSWFQYHSLDVIGFLLLCVVGVVFIITKFCLFCCRKTANMGKKKK 528



> g18146841 UGT2B21; UDP-glucuronosyl transferase 2B21; EC 2.4.1.17;
UDP-glycosyl transferase
Length = 528

Score = 771 bits (1969), Expect = 0.0 

Identities = 361/528 (68%), Positives = 434/528 (81%), Gaps = 1/528 (0%) 

Query: 3 MKWTSALLLIQLSCYFSSGSCGKVLVWPTEFSHWMNIKTILDELVQRGHEVTVLASSASI 62 

MK ALLL+QL C+F SGSCGKVLVWP EFSHWMNI+ IL+EL++RGHEVTVL S I 
Sbjct: 1 MKRILALLLLQLCCHFHSGSCGKVLVWPMEFSHWMNIQVILEELIRRGHEVTVLRPSCFI 60

Query: 63 SFDPNSPSTLKFEVYPVSLTKTEFEDIIKQLVKRWAELPK-DTFWSYFSQVQEIMWTFND 121 

D N+ S +KFE + S T+ +E I LV W DT YF +V+ + + F+D 

Sbjct: 61 FVDVNTTSEIKFETFHTSFTRDYWEKIFTDLVTTWLNTGSVDTCLDYFPEVEKLFKHFSD 120 

Query: 122 ILRKFCKDIVSNKKLMKKLQESRFDVVLADAVFPFGELLAELLKIPFVYSLRFSPGYAIE 181

CK++VSNKK MK LQESRFD++LADAV P GEL+AE+L IPFVYSLRFSPG+ E 
Sbjct: 121 EWENVCKELVSNKKFMKNLQESRFDILLADAVGPCGELVAEILHIPFVYSLRFSPGFQAE 180 

Query: 182 KHSGGLLFPPSYVPVVMSELSDQMTFIERVKNMIYVLYFEFWFQIFDMKKWDQFYSEVLG 241

K +GGLL PPSYVPV+MS LS +MTF+ERVKNMI +LYF+FWF+ FD K+WD+ YSE+LG
Sbjct: 181 KRAGGLLLPPSYVPVIMSGLSGEMTFMERVKNMICMLYFDFWFETFDEKRWDKLYSEILG 240

Query: 242 RPTTLSETMAKADIWLIRNYWDFQFPHPLLPNVEFVGGLHCKPAKPLPKEMEEFVQSSGE 301 

+P+TL ETM+KAD+WLIR+YWD +FPHP LPN +++GGLHCKPAKPLPKEMEEFVQSSGE
Sbjct: 241 KPSTLYETMSKADMWLIRSYWDMEFPHPSLPNFDYIGGLHCKPAKPLPKEMEEFVQSSGE 300

Query: 302 NGVVVFSLGSMVSNTSEERANVIASALAKIPQKVLWRFDGNKPDTLGLNTRLYKWIPQND 361

+G+VVFSLGSM+ N ++E+AN+IASAL +IPQKVLWRFDG KPDTLG NTRLYKWIPQND
Sbjct: 301 HGIVVFSLGSMIRNMTDEKANLIASALGQIPQKVLWRFDGKKPDTLGANTRLYKWIPQND 360

Query: 362 LLGHPKTKAFITHGGMNGIYEAIYHGVPMVGVPIFGDQLDNIAHMKAKGAAVEINFKTMT 421 

LLGHPKT+AFITHGG NGIYEAIYHG+PMVG+P+FG+Q DNIAHMKAKGAA+++ F +++
Sbjct: 361 LLGHPKTRAFITHGGANGIYEAIYHGIPMVGLPLFGEQYDNIAHMKAKGAAMKLEFNSLS 420 

Query: 422 SEDLLRALRTVITDSSYKENAMRLSRIHHDQPVKPLDRAVFWIEFVMRHKGAKHLRSAAH 481 

S DLL AL+TVI + SYKENAM LS IHHDQP+KPLDRAVFWIE+VM+HKGAKHLR AH 
Sbjct: 421 STDLLNALKTVINNPSYKENAMWLSTIHHDQPMKPLDRAVFWIEYVMQHKGAKHLRPLAH 480 



Query: 482 DLTWFQHYSIDVIGFLLTCVATAIFLFTKCFLFSCQKFNKTRKIEKRE 529 

+LTW+Q++S+DVIGFLL CVA FL KC LF QKF +T K +KRE
Sbjct: 481 NLTWYQYHSLDVIGFLLACVAAITFLIIKCCLFCFQKFMETGKKKKRE 528 



> g458395 UDP-glucuronosyl transferase; EC 2.4.1.17
Length = 530 



Score = 755 bits (1929), Expect = 0.0 

Identities = 354/530 (66%), Positives = 427/530 (79%), Gaps = 1/530 (0%) 

Query: 1 MSMKWTSALLLIQLSCYFSSGSCGKVLVWPTEFSHWMNIKTILDELVQRGHEVTVLASSA 60

MS KW SALLL+Q+S F SG+CGKVLVWP E+SHWMNIK IL+ELVQ+GHEVTVL SA
Sbjct: 1 MSGKWISALLLLQISFCFKSGNCGKVLVWPMEYSHWMNIKIILEELVQKGHEVTVLRPSA 60

Query: 61 SISFDPNSPSTLKFEVYPVSLTKTEFEDIIKQLVKRWA-ELPKDTFWSYFSQVQEIMWTF 119

+ DP S LKF +P S + + E+ + V W ELP+DT SYF +Q+ + +

Query: 120 NDILRKFCKDIVSNKKLMKKLQESRFDVVLADAVFPFGELLAELLKIPFVYSLRFSPGYA 179

+D CK+ VSNK+ M KLQES+FDVV +DA+ P GEL+AELL+IPF+YSLRFSPGY
Sbjct: 121 SDYCLTVCKEAVSNKQFMTKLQESKFDVVFSDAIGPCGELIAELLQIPFLYSLRFSPGYT 180

Query: 180 IEKHSGGLLFPPSYVPVVMSELSDQMTFIERVKNMIYVLYFEFWFQIFDMKKWDQFYSEV 239

IE++ GG+LFPPSYVP++ S L+ QMTFIERV NMI +LYF+FWFQ F KKWD FYS+
Sbjct: 181 IEQYIGGVLFPPSYVPMIFSGLAGQMTFIERVHNMICMLYFDFWFQTFREKKWDPFYSKT 240

Query: 240 LGRPTTLSETMAKADIWLIRNYWDFQFPHPLLPNVEFVGGLHCKPAKPLPKEMEEFVQSS 299

LGRPTTL+E M KA++WLIR+YWD +FPHP+ PNV+++GGLHCKPAKPLPK++E+FVQSS
Sbjct: 241 LGRPTTLAEIMGKAEMWLIRSYWDLEFPHPISPNVDYIGGLHCKPAKPLPKDIEDFVQSS 300

Query: 300 GENGVVVFSLGSMVSNTSEERANVIASALAKIPQKVLWRFDGNKPDTLGLNTRLYKWIPQ 359

GE+GVVVFSLGSMV N +EE+AN+IA ALA+IPQKVLWRFDG KP TLG NTRLYKW+PQ
Sbjct: 301 GEHGVVVFSLGSMVRNMTEEKANIIAWALAQIPQKVLWRFDGKKPPTLGPNTRLYKWLPQ 360

Query: 360 NDLLGHPKTKAFITHGGMNGIYEAIYHGVPMVGVPIFGDQLDNIAHMKAKGAAVEINFKT 419 

NDLLGHPKTKAF+THGG NGIYEAI+HG+PM+G+P+F +Q DNIAHM AKGAAVE+NF+T
Sbjct: 361 NDLLGHPKTKAFVTHGGANGIYEAIHHGIPMIGIPLFAEQHDNIAHMVAKGAAVEVNFRT 420 

Query: 420 MTSEDLLRALRTVITDSSYKENAMRLSRIHHDQPVKPLDRAVFWIEFVMRHKGAKHLRSA 479 

M+ DLL AL VI + YK+NAM LS IHHDQP KPLDRAVFWIEFVMRHKGAKHLRS 
Sbjct: 421 MSKSDLLNALEEVIDNPFYKKNAMWLSTIHHDQPTKPLDRAVFWIEFVMRHKGAKHLRSL 480 

Query: 480 AHDLTWFQHYSIDVIGFLLTCVATAIFLFTKCFLFSCQKFNKTRKIEKRE 529 

H+L W+Q++S+DVIGFLL+CVA + L KCFLF + F K K K E 
Sbjct: 481 GHNLPWYQYHSLDVIGFLLSCVAVTVVLALKCFLFVYRFFVKKEKKTKNE 530



> g458397 UDP-glucuronosyl transferase; EC 2.4.1.17
Length = 530 

Score = 747 bits (1908), Expect = 0.0 

Identities = 351/530 (66%), Positives = 423/530 (79%), Gaps = 1/530 (0%) 

Query: 1 MSMKWTSALLLIQLSCYFSSGSCGKVLVWPTEFSHWMNIKTILDELVQRGHEVTVLASSA 60

M KW SALLL+Q+S F SG+CGKVLVWP E+SHWMNIK IL+ELVQ+GHEVTVL SA
Sbjct: 1 MPGKWISALLLLQISFCFKSGNCGKVLVWPMEYSHWMNIKIILEELVQKGHEVTVLRPSA 60

Query: 61 SISFDPNSPSTLKFEVYPVSLTKTEFEDIIKQLVKRWA-ELPKDTFWSYFSQVQEIMWTF 119

S+ DP S LKF +P S + + E+ + V W ELP+DT SYF +Q+ + +
Sbjct: 61 SVFLDPKETSHLKFVTFPTSFSSHDLENFFTRFVSVWTYELPRDTCLSYFLYLQDTIDEY 120

Query: 120 NDILRKFCKDIVSNKKLMKKLQESRFDVVLADAVFPFGELLAELLKIPFVYSLRFSPGYA 179

+D CK+ VSNK+ M KLQES+FDVV +DA+ P GEL+AELL+IPF+YSLRFSPGY

Sbjct: 121 SDYCLTVCKEAVSNKQFMTKLQESKFDVVFSDAIGPCGELIAELLQIPFLYSLRFSPGYT 180

Query: 180 IEKHSGGLLFPPSYVPVVMSELSDQMTFIERVKNMIYVLYFEFWFQIFDMKKWDQFYSEV 239

IEK+ GG+LFPPSYVP++ S L+ QMTFIERV NMI +LYF+FWFQ F KKWD FYS+ 
Sbjct: 181 IEKYIGGVLFPPSYVPMIFSGLAGQMTFIERVHNMICMLYFDFWFQTFREKKWDPFYSKT 240 

Query: 240 LGRPTTLSETMAKADIWLIRNYWDFQFPHPLLPNVEFVGGLHCKPAKPLPKEMEEFVQSS 299 

LGRPTTL+E M KA++WLIR+YWD +FPHP+ PNV+++GGLHCKPAKPLPK++E+FVQSS 
Sbjct: 241 LGRPTTLAEIMGKAEMWLIRSYWDLEFPHPISPNVDYIGGLHCKPAKPLPKDIEDFVQSS 300 

Query: 300 GENGVVVFSLGSMVSNTSEERANVIASALAKIPQKVLWRFDGNKPDTLGLNTRLYKWIPQ 359

GE+GVVVFSLGSMV N +EE+AN+IA ALA+IPQKVLWRFDG KP TLG NTRLYKW+PQ
Sbjct: 301 GEHGVVVFSLGSMVRNMTEEKANIIAWALAQIPQKVLWRFDGKKPTTLGPNTRLYKWLPQ 360

Query: 360 NDLLGHPKTKAFITHGGMNGIYEAIYHGVPMVGVPIFGDQLDNIAHMKAKGAAVEINFKT 419

NDLLGHPKTKAF+THGG NGIYEAI+HG+PM+G+P+FG+Q DNIAHM AKGAA +NF+T
Sbjct: 361 NDLLGHPKTKAFVTHGGANGIYEAIHHGIPMIGIPLFGEQHDNIAHMVAKGAAATVNFRT 420

Query: 420 MTSEDLLRALRTVITDSSYKENAMRLSRIHHDQPVKPLDRAVFWIEFVMRHKGAKHLRSA 479 

M+ DLL AL I + YK+NAM LS IHHDQP KPLDRAVFWIEFVMRHKGA HLRS 
Sbjct: 421 MSKSDLLNALEEDIDNPFYKKNAMWLSTIHHDQPTKPLDRAVFWIEFVMRHKGALHLRSL 480 

Query: 480 AHDLTWFQHYSIDVIGFLLTCVATAIFLFTKCFLFSCQKFNKTRKIEKRE 529 

H+L W+ ++S+DVIGFLL+CVA + L KCFLF + F K K K E 
Sbjct: 481 GHNLPWYLYHSLDVIGFLLSCVAVTVVLALKCFLFVYRFFVKKEKKTKNE 530



> g20381430 UDP-glucuronosyl transferase 2 family, member 5 
Length = 530 

Score = 747 bits (1907), Expect = 0.0 
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Identities = 355/530 (66%), Positives = 421/530 (78%), Gaps = 1/530 (0%) 

Query: 1   MSMKWTSALLLIQLSCYFSSGSCGKVLVWPTEFSHWMNIKTILDELVQRGHEVTVLASSA 60
           M  KW SALLL+Q+SC F S  CGKVLVWP EFSHWMNIK ILDELVQRGHEVTVL  SA
Sbjct: 1   MPGKWISALLLLQISCCFRSVKCGKVLVWPMEFSHWMNIKIILDELVQRGHEVTVLRPSA 60

Query: 61  SISFDPNSPSTLKFEVYPVSLTKTEFEDIIKQLVKRWA-ELPKDTFWSYFSQVQEIMWTF 119
               DP     LKFE +P S++K   E+   + V  W  E+P+DT  SY   +Q ++  F
Sbjct: 61  YYVLDPKKSPGLKFETFPTSVSKDNLENFFIKFVDVWTYEMPRDTCLSYSPLLQNMIDEF 120

Query: 120 NDILRKFCKDIVSNKKLMKKLQESRFDVVLADAVFPFGELLAELLKIPFVYSLRFSPGYA 179
           +D     CKD+VSNK+LM KLQES+FDV+L+D V   GEL+AELL+IPF+YS+RFSPGY
Sbjct: 121 SDYFLSLCKDVVSNKELMTKLQESKFDVLLSDPVASCGELIAELLQIPFLYSIRFSPGYQ 180

Query: 180 IEKHSGGLLFPPSYVPVVMSELSDQMTFIERVKNMIYVLYFEFWFQIFDMKKWDQFYSEV 239
           IEK SG  L PPSYVPV++S L  QMTFIER+KNMI +LYF+FWFQ+F+ KKWD FYSE
Sbjct: 181 IEKSSGRFLLPPSYVPVILSGLGGQMTFIERIKNMICMLYFDFWFQMFNDKKWDSFYSEY 240

Query: 240 LGRPTTLSETMAKADIWLIRNYWDFQFPHPLLPNVEFVGGLHCKPAKPLPKEMEEFVQSS 299
           LGRPTTL ETM +A++WLIR+ WD +FPHP LPNV++VGGLHCKPAKPLPK+MEEFVQSS
Sbjct: 241 LGRPTTLVETMGQAEMWLIRSNWDLEFPHPTLPNVDYVGGLHCKPAKPLPKDMEEFVQSS 300

Query: 300 GENGVVVFSLGSMVSNTSEERANVIASALAKIPQKVLWRFDGNKPDTLGLNTRLYKWIPQ 359
           G++GVVVFSLGSMVSN +EE+AN IA ALA+IPQKVLW+FDG  P TLG NTR+YKW+PQ
Sbjct: 301 GDHGVVVFSLGSMVSNMTEEKANAIAWALAQIPQKVLWKFDGKTPATLGHNTRVYKWLPQ 360

Query: 360 NDLLGHPKTKAFITHGGMNGIYEAIYHGVPMVGVPIFGDQLDNIAHMKAKGAAVEINFKT 419
           NDLLGHPKTKAF+THGG NG+YEAIYHG+PM+G+P+FG+Q DNIAHM AKGAAV +N +T
Sbjct: 361 NDLLGHPKTKAFVTHGGANGVYEAIYHGIPMIGIPLFGEQHDNIAHMVAKGAAVALNIRT 420

Query: 420 MTSEDLLRALRTVITDSSYKENAMRLSRIHHDQPVKPLDRAVFWIEFVMRHKGAKHLRSA 479
           M+  D+L AL  VI +  YK+NAM LS IHHDQP+KPLDRAVFWIEFVMRHK AKHLR
Sbjct: 421 MSKSDVLNALEEVIENPFYKKNAMWLSTIHHDQPMKPLDRAVFWIEFVMRHKRAKHLRPL 480

Query: 480 AHDLTWFQHYSIDVIGFLLTCVATAIFLFTKCFLFSCQKFNKTRKIEKRE 529
            H+LTW+Q++S+DVIGFLL+CVAT I L  KC LF  + F K     K E
Sbjct: 481 GHNLTWYQYHSLDVIGFLLSCVATTIVLSVKCLLFIYRFFVKKENKMKNE 530



> g55120 unnamed protein product; UDP-glucuronosyltransferase
precursor (530 AA)
Length = 530 

Score = 745 bits (1903), Expect = 0.0 

Identities = 354/530 (66%), Positives = 421/530 (78%), Gaps = 1/530 (0%) 

Query: 1   MSMKWTSALLLIQLSCYFSSGSCGKVLVWPTEFSHWMNIKTILDELVQRGHEVTVLASSA 60
           M  KW SALLL+Q+SC F S  CGKVLVWP EFSHWMNIK ILDELVQRGHEVTVL  SA
Sbjct: 1   MPGKWISALLLLQISCCFRSVKCGKVLVWPMEFSHWMNIKIILDELVQRGHEVTVLRPSA 60

Query: 61  SISFDPNSPSTLKFEVYPVSLTKTEFEDIIKQLVKRWA-ELPKDTFWSYFSQVQEIMWTF 119
               DP     LKFE +P S++K   E+   + V  W  E+P+DT  SY   +Q ++  F
Sbjct: 61  YYVLDPKKSPGLKFETFPTSVSKDNLENFFIKFVDVWTYEMPRDTCLSYSPLLQNMIDEF 120

Query: 120 NDILRKFCKDIVSNKKLMKKLQESRFDVVLADAVFPFGELLAELLKIPFVYSLRFSPGYA 179
           +D     CKD+VSNK+LM KLQES+FDV+L+D V   GEL+AELL+IPF+YS+RFSPGY
Sbjct: 121 SDYFLSLCKDVVSNKELMTKLQESKFDVLLSDPVASCGELIAELLQIPFLYSIRFSPGYQ 180

Query: 180 IEKHSGGLLFPPSYVPVVMSELSDQMTFIERVKNMIYVLYFEFWFQIFDMKKWDQFYSEV 239
           IEK SG  L PPSYVPV++S L  QMTFIER+KNMI +LYF+FWFQ+F+ KKWD FYSE
Sbjct: 181 IEKSSGRFLLPPSYVPVILSGLGGQMTFIERIKNMICMLYFDFWFQMFNDKKWDSFYSEY 240

Query: 240 LGRPTTLSETMAKADIWLIRNYWDFQFPHPLLPNVEFVGGLHCKPAKPLPKEMEEFVQSS 299
           LGRPTTL ETM +A++WLIR+ WD +FPHP LPNV++VGGLHCKPAKPLPK+MEEFVQSS
Sbjct: 241 LGRPTTLVETMGQAEMWLIRSNWDLEFPHPTLPNVDYVGGLHCKPAKPLPKDMEEFVQSS 300

Query: 300 GENGVVVFSLGSMVSNTSEERANVIASALAKIPQKVLWRFDGNKPDTLGLNTRLYKWIPQ 359
           G++GVVVFSLGSMVSN +EE+AN IA ALA+IPQKVLW+FDG  P TLG NTR+YKW+PQ
Sbjct: 301 GDHGVVVFSLGSMVSNMTEEKANAIAWALAQIPQKVLWKFDGKTPATLGHNTRVYKWLPQ 360

Query: 360 NDLLGHPKTKAFITHGGMNGIYEAIYHGVPMVGVPIFGDQLDNIAHMKAKGAAVEINFKT 419
           NDLLGHPKTKAF+THGG NG+YEAIYHG+PM+G+P+FG+Q DNIAHM AKGAAV +N +T
Sbjct: 361 NDLLGHPKTKAFVTHGGANGVYEAIYHGIPMIGIPLFGEQHDNIAHMVAKGAAVALNIRT 420

Query: 420 MTSEDLLRALRTVITDSSYKENAMRLSRIHHDQPVKPLDRAVFWIEFVMRHKGAKHLRSA 479
           M+  D+L AL  VI +  YK+NA+ LS IHHDQP+KPLDRAVFWIEFVMRHK AKHLR
Sbjct: 421 MSKSDVLNALEEVIENPFYKKNAIWLSTIHHDQPMKPLDRAVFWIEFVMRHKRAKHLRPL 480

Query: 480 AHDLTWFQHYSIDVIGFLLTCVATAIFLFTKCFLFSCQKFNKTRKIEKRE 529
            H+LTW+Q++S+DVIGFLL+CVAT I L  KC LF  + F K     K E
Sbjct: 481 GHNLTWYQYHSLDVIGFLLSCVATTIVLSVKCLLFIYRFFVKKENKMKNE 530



> g15929692 RIKEN cDNA 9430041C03
Length = 530 

Score = 742 bits (1894), Expect = 0.0 

Identities = 354/530 (66%), Positives = 418/530 (78%), Gaps = 1/530 (0%) 

Query: 1 MSMKWTSALLLIQLSCYFSSGSCGKVLVWPTEFSHWMNIKTILDELVQRGHEVTVLASSA 60

M KW SALLL+Q+SC F S CGKVLVWP EFSHWMNIK ILDELVQRGHEVTVL SA
Sbjct: 1 MPGKWISALLLLQISCCFRSVKCGKVLVWPMEFSHWMNIKIILDELVQRGHEVTVLRPSA 60

Query: 61 SISFDPNSPSTLKFEVYPVSLTKTEFEDIIKQLVKRWA-ELPKDTFWSYFSQVQEIMWTF 119

DP LKFE +P S++K E+ + V W E+P+DT SY +Q ++ F
Sbjct: 61 YYVLDPKKSPGLKFETFPTSVSKDNLENFFIKFVDVWTYEMPRDTCLSYSPLLQNMIDEF 120

Query: 120 NDILRKFCKDIVSNKKLMKKLQESRFDVVLADAVFPFGELLAELLKIPFVYSLRFSPGYA 179

+D CKD+VSNK+LM KLQES+FDV+L+D V GEL+AELL+IPF+YS+RFSPGY
Sbjct: 121 SDYFLSLCKDVVSNKELMTKLQESKFDVLLSDPVASCGELIAELLQIPFLYSIRFSPGYQ 180

Query: 180 IEKHSGGLLFPPSYVPVVMSELSDQMTFIERVKNMIYVLYFEFWFQIFDMKKWDQFYSEV 239

IEK SG L PPSYVPV++S L QMTFIERVKNMI LYF+FWFQ+F+ KKWD FYSE
Sbjct: 181 IEKSSGRFLLPPSYVPVILSGLGGQMTFIERVKNMICRLYFDFWFQMFNDKKWDSFYSEY 240

Query: 240 LGRPTTLSETMAKADIWLIRNYWDFQFPHPLLPNVEFVGGLHCKPAKPLPKEMEEFVQSS 299

LGRPTTL+ETM KA++WLIR+ WD +FPHP LPNV++VGGLHCKPAKPLPK+MEEFVQSS
Sbjct: 241 LGRPTTLAETMGKAEMWLIRSNWDLEFPHPTLPNVDYVGGLHCKPAKPLPKDMEEFVQSS 300

Query: 300 GENGVVVFSLGSMVSNTSEERANVIASALAKIPQKVLWRFDGNKPDTLGLNTRLYKWIPQ 359

G++GVVVFSLGSMVSN +EE+AN IA ALA+IPQKVLW+FDG P TLG NTR+YKW+PQ
Sbjct: 301 GDHGVVVFSLGSMVSNMTEEKANTIAWALAQIPQKVLWKFDGKTPATLGHNTRVYKWLPQ 360

Query: 360 NDLLGHPKTKAFITHGGMNGIYEAIYHGVPMVGVPIFGDQLDNIAHMKAKGAAVEINFKT 419

NDLLGHPKTKAF+THGG NG+YE IYHG+PM+G+P+FG+Q DNIAHM AKGAAV +N +T
Sbjct: 361 NDLLGHPKTKAFVTHGGANGVYEVIYHGIPMIGIPLFGEQHDNIAHMVAKGAAVTLNIRT 420

Query: 420 MTSEDLLRALRTVITDSSYKENAMRLSRIHHDQPVKPLDRAVFWIEFVMRHKGAKHLRSA 479

M+ D+L AL VI + YK+NA+ LS IHHDQP KPLDRAVFW+EFVMRHK AKHLRS
Sbjct: 421 MSRSDVLNALEEVIDNPFYKKNAIWLSTIHHDQPTKPLDRAVFWVEFVMRHKRAKHLRSL 480

Query: 480 AHDLTWFQHYSIDVIGFLLTCVATAIFLFTKCFLFSCQKFNKTRKIEKRE 529

H+LTW Q++ +DVIGFLL+CVA I L KC LF + F K K K E
Sbjct: 481 GHNLTWHQYHFLDVIGFLLSCVAVTIVLTVKCLLFIYRFFVKKEKKIKNE 530 
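
Each hit in the report above carries one statistics line of the form "Identities = 361/528 (68%), Positives = 429/528 (80%), Gaps = 1/528 (0%)". As a minimal sketch, assuming that line layout, the Python below collects the counts into a dictionary so hits can be compared programmatically. The function name parse_alignment_stats is illustrative and is not part of BLAST.

```python
import re

# Matches "Identities = 361/528 (68%)" and the same pattern for Positives and Gaps.
_STAT = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)\s*\((\d+)%\)")


def parse_alignment_stats(line):
    """Return {name: (count, aligned_length, reported_percent)} for one statistics line."""
    return {
        name: (int(count), int(length), int(percent))
        for name, count, length, percent in _STAT.findall(line)
    }


if __name__ == "__main__":
    # Statistics line copied from the first hit (g207581).
    line = "Identities = 361/528 (68%), Positives = 429/528 (80%), Gaps = 1/528 (0%)"
    for name, values in parse_alignment_stats(line).items():
        print(name, values)
```

The percentages are stored exactly as printed rather than recomputed, so any rounding applied by the reporting program is preserved.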



Database: gb137rodp

Posted date: Sep 11, 2003 11:24 AM
Number of letters in database: 25,169,402
Number of sequences in database: 74,095

Lambda     K      H
   0.324    0.138    0.428

Gapped
Lambda     K      H
   0.270   0.0470    0.230



Matrix: BLOSUM62 

Gap Penalties: Existence: 11, Extension: 1 

Number of Hits to DB: 27352210 

Number of Sequences: 74095 

Number of extensions: 1162215 

Number of successful extensions: 2617 

Number of sequences better than 10.0: 77 

Number of HSP's better than 10.0 without gapping: 72 

Number of HSP's successfully gapped in prelim test: 5 

Number of HSP's that attempted gapping in prelim test: 2394 

Number of HSP's gapped (non-prelim): 84 

length of query: 529 

length of database: 25,169,402 

effective HSP length: 49 

effective length of query: 480 

effective length of database: 21,538,747 

effective search space: 10338598560 

effective search space used: 10338598560 

T: 11

A: 40

X1: 15 ( 7.0 bits)

X2: 38 (14.8 bits)

X3: 64 (24.9 bits)

S1: 41 (22.0 bits)
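
The summary above gives enough information to check how the reported figures relate to one another. The sketch below is an illustration rather than BLAST's own code: it applies the standard Karlin-Altschul normalization S' = (lambda * S - ln K) / ln 2 to a raw score, reproduces the effective search space from the query length, database size and effective HSP length, and estimates an expectation value as E = (search space) * 2^(-S'). The function names are hypothetical, and because Lambda and K are printed in rounded form the bit scores only approximate the values in the hit headers.

```python
import math

# Values copied from the search summary; Lambda and K are printed rounded.
GAPPED_LAMBDA = 0.270
GAPPED_K = 0.0470
QUERY_LENGTH = 529
DB_LETTERS = 25_169_402
DB_SEQUENCES = 74_095
EFFECTIVE_HSP_LENGTH = 49


def bit_score(raw_score, lam=GAPPED_LAMBDA, k=GAPPED_K):
    """Normalize a raw alignment score: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def effective_search_space(query_len, db_letters, db_sequences, hsp_len):
    """Shorten the query and every database sequence by the effective HSP length."""
    effective_query = query_len - hsp_len
    effective_db = db_letters - db_sequences * hsp_len
    return effective_query * effective_db


def expect_value(bits, search_space):
    """Expected number of chance alignments scoring at least `bits`."""
    return search_space * 2.0 ** (-bits)


if __name__ == "__main__":
    space = effective_search_space(
        QUERY_LENGTH, DB_LETTERS, DB_SEQUENCES, EFFECTIVE_HSP_LENGTH
    )
    print("effective search space:", space)
    for raw in (1972, 1894):  # raw scores of the first and last hits shown
        bits = bit_score(raw)
        print(raw, round(bits, 1), expect_value(bits, space))
```

With bit scores above 700 the estimated expectation is vanishingly small, which is consistent with the "Expect = 0.0" shown for every hit.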



Exhibit C

with Response dated 4/26/04
In USSN: 09/980,729

1: AAA42313. UDP-glucuronosylt...[gi:207581]

LOCUS       AAA42313                 529 aa            linear   ROD 27-APR-1993
DEFINITION  UDP-glucuronosyltransferase (EC 2.4.1.17).
ACCESSION   AAA42313
VERSION     AAA42313.1  GI:207581
DBSOURCE    locus RATUDPGTP accession M13506.1
KEYWORDS
SOURCE      Rattus norvegicus (Norway rat)
  ORGANISM  Rattus norvegicus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae;
            Rattus.
REFERENCE   1  (residues 1 to 529)
  AUTHORS   Mackenzie,P.I.
  TITLE     Rat liver UDP-glucuronosyltransferase. Sequence and expression of
            cDNA encoding a phenobarbital-inducible form
  JOURNAL   J. Biol. Chem. 261 (13), 6119-6125 (1986)
  MEDLINE   86196018
   PUBMED   3084479
COMMENT     Method: conceptual translation.
FEATURES             Location/Qualifiers
     source          1..529
                     /organism="Rattus norvegicus"
                     /db_xref="taxon:10116"
     Protein         1..529
                     /name="UDP-glucuronosyltransferase (EC 2.4.1.17)"
     sig_peptide     1..24
                     /note="UDP-glucuronosyltransferase, signal peptide"
     mat_peptide     25..529
                     /product="UDP-glucuronosyltransferase"
     CDS             1..529
                     /coded_by="M13506.1:26..1615"
ORIGIN
        1 msmkqtsvfl liqlicyfrp gacgkvlvwp teyshwinik iilnelaqrg hevtvlvssa
       61 silieptkes sinfeiysvp lsksdleysf akwidewtrd fetlsiwtyy skmqkvfney
      121 sdvvenlcka liwnkslmkk lqgsqfdvil adavgpcgel laellktplv yslrfcpgyr
      181 cekfsgglpl ppsyvpvvls elsdrmtfve rvknmlqmly fdfwfqpfke kswsqfysdv
      241 lgrpttltem mgkadiwlir tfwdlefphp flpnfdfvgg lhckpakplp remeefvqss
      301 gehgvvvfsl gsmvknltee kanvvasala qipqkvvwrf dgkkpdtlgs ntrlykwipq
      361 ndllghpktk afvahggtng iyeaiyhgip ivgiplfadq pdninhmvak gaavrvdfsi
      421 lsttglltal kivmndpsyk enamrlsrih hdqpvkpldr avfwieyvmr hkgakhlrst
      481 lhdlswfqyh sldvigflll cvvgvvfiit kfclfccrkt anmgkkkke
//
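
The ORIGIN block of a GenBank flat-file record, such as the rat record above, stores the protein as numbered rows of ten-residue groups. A minimal sketch of turning such a block into a single sequence string for use with an alignment program follows; the helper name origin_to_sequence is hypothetical, and the sketch assumes the block contains only coordinates, whitespace, residue letters and the closing "//".

```python
import re


def origin_to_sequence(origin_block):
    """Join the residue groups of a GenBank ORIGIN block into one uppercase string."""
    residues = []
    for line in origin_block.splitlines():
        line = line.strip()
        # Skip blank lines, the ORIGIN keyword and the record terminator.
        if not line or line == "//" or line.upper().startswith("ORIGIN"):
            continue
        # Drop the leading coordinate and the spaces between residue groups.
        residues.append(re.sub(r"[\d\s]", "", line))
    return "".join(residues).upper()


if __name__ == "__main__":
    # First row of the record above, used here only as sample input.
    block = """ORIGIN
        1 msmkqtsvfl liqlicyfrp gacgkvlvwp teyshwinik iilnelaqrg hevtvlvssa
//"""
    sequence = origin_to_sequence(block)
    print(len(sequence), sequence)
```

Comparing the length of the joined string with the "aa" count on the LOCUS line is a quick way to spot residues lost in transcription.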



CLUSTAL W (1.7) Multiple Sequence Alignments 



Sequence format is Pearson 
Sequence 1: 2912330CD1 529 aa 

Sequence 2: g207581 529 aa 

Start of Pairwise alignments 
Aligning. . . 

Sequences (1:2) Aligned. Score: 68 
Start of Multiple Alignment 
There are 1 groups 
Aligning . . . 

Group 1: Sequences: 2 Score: 6064 

Alignment Score 2397 

CLUSTAL- Alignment file created [baa4vairG . aln] 
CLUSTAL W (1.7) multiple sequence alignment 



2912330CD1      MSMKWTSALLLIQLSCYFSSGSCGKVLVWPTEFSHWMNIKTILDELVQRGHEVTVLASSA
g207581         MSMKQTSVFLLIQLICYFRPGACGKVLVWPTEYSHWINIKIILNELAQRGHEVTVLVSSA
                **** ** .***** *** *.**********.***.*** **.** ********* ***

2912330CD1      SISFDPNSPSTLKFEVYPVSLTKTEFEDIIKQLVKRWA-ELPKDTFWSYFSQVQEIMWTF
g207581         SILIEPTKESSINFEIYSVPLSKSDLEYSFAKWIDEWTRDFETLSIWTYYSKMQKVFNEY
                ** ..* *...**.***.*...* . . . *. .. ..*.*.*..*...

2912330CD1      NDILRKFCKDIVSNKKLMKKLQESRFDVVLADAVFPFGELLAELLKIPFVYSLRFSPGYA
g207581         SDVVENLCKALIWNKSLMKKLQGSQFDVILADAVGPCGELLAELLKTPLVYSLRFCPGYR
                *.. ..** .. ** ****** *.***.***** * ********* *.****** ***

2912330CD1      IEKHSGGLLFPPSYVPVVMSELSDQMTFIERVKNMIYVLYFEFWFQIFDMKKWDQFYSEV
g207581         CEKFSGGLPLPPSYVPVVLSELSDRMTFVERVKNMLQMLYFDFWFQPFKEKSWSQFYSDV
                ** **** .********.*****.***.******. .***.**** * * * ****.*

2912330CD1      LGRPTTLSETMAKADIWLIRNYWDFQFPHPLLPNVEFVGGLHCKPAKPLPKEMEEFVQSS
g207581         LGRPTTLTEMMGKADIWLIRTFWDLEFPHPFLPNFDFVGGLHCKPAKPLPREMEEFVQSS
                *******.* * ******** .**..****.*** .**************.*********

2912330CD1      GENGVVVFSLGSMVSNTSEERANVIASALAKIPQKVLWRFDGNKPDTLGLNTRLYKWIPQ
g207581         GEHGVVVFSLGSMVKNLTEEKANVVASALAQIPQKVVWRFDGKKPDTLGSNTRLYKWIPQ
                **.*********** * .**.***.*****.*****.*****.****** **********

2912330CD1      NDLLGHPKTKAFITHGGMNGIYEAIYHGVPMVGVPIFGDQLDNIAHMKAKGAAVEINFKT
g207581         NDLLGHPKTKAFVAHGGTNGIYEAIYHGIPIVGIPLFADQPDNINHMVAKGAAVRVDFSI
                ************..*** **********.*.**.*.*** *** ** ****** ..*.

2912330CD1      MTSEDLLRALRTVITDSSYKENAMRLSRIHHDQPVKPLDRAVFWIEFVMRHKGAKHLRSA
g207581         LSTTGLLTALKIVMNDPSYKENAMRLSRIHHDQPVKPLDRAVFWIEYVMRHKGAKHLRST
                ... ** **. *. * *****************************.************.

2912330CD1      AHDLTWFQHYSIDVIGFLLTCVATAIFLFTKCFLFSCQKFNKTRKIEKRE
g207581         LHDLSWFQYHSLDVIGFLLLCVVGVVFIITKFCLFCCRKTANMGKK-KKE
                ***.***..*.******* ** .*..** ** *.* . * *.*
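
The ClustalW run above reports a pairwise score of 68 for the two sequences, which appears to correspond to their percent identity. A minimal sketch of computing percent identity from two aligned rows is shown below; the function name percent_identity is illustrative, and the choice to ignore columns containing a gap is an assumption that may differ from the convention ClustalW itself uses.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned (non-gap) column pairs that hold the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        # Columns where either row has a gap are not counted.
        if a == gap or b == gap:
            continue
        compared += 1
        if a.upper() == b.upper():
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


if __name__ == "__main__":
    # First alignment block of 2912330CD1 against g207581.
    row_a = "MSMKWTSALLLIQLSCYFSSGSCGKVLVWPTEFSHWMNIKTILDELVQRGHEVTVLASSA"
    row_b = "MSMKQTSVFLLIQLICYFRPGACGKVLVWPTEYSHWINIKIILNELAQRGHEVTVLVSSA"
    print(round(percent_identity(row_a, row_b), 1))
```

Concatenating all alignment blocks before calling the function gives a figure for the whole alignment rather than a single block.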




2/27/04 3:55 PM 



NCBI Sequence Viewer 



http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=protein&db==prot.. 





1: NP_033493. UDP-glucuronosylt...[gi:6678501]

LOCUS       NP_033493                530 aa            linear   ROD 24-DEC-2003
DEFINITION  UDP-glucuronosyltransferase 2 family, member 5 [Mus musculus].
ACCESSION   NP_033493
VERSION     NP_033493.1  GI:6678501
DBSOURCE    REFSEQ: accession NM_009467.1
KEYWORDS
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (residues 1 to 530)
  AUTHORS   Kimura,T. and Owens,I.S.
  TITLE     Mouse UDP glucuronosyltransferase. cDNA and complete amino acid
            sequence and regulation
  JOURNAL   Eur. J. Biochem. 168 (3), 515-521 (1987)
   PUBMED   3117546
COMMENT     PROVISIONAL REFSEQ: This record has not yet been subject to final
            NCBI review. The reference sequence was derived from X06358.1.
FEATURES             Location/Qualifiers
     source          1..530
                     /organism="Mus musculus"
                     /strain="C57Bl/6N"
                     /db_xref="taxon:10090"
                     /chromosome="5"
                     /map="5"
     Protein         1..530
                     /product="UDP-glucuronosyltransferase 2 family, member 5"
     variation       19
                     /allele="R"
                     /allele="S"
                     /db_xref="dbSNP:8258200"
     Region          24..528
                     /region_name="UDP-glucoronosyl and UDP-glucosyl
                     transferase"
                     /note="UDPGT"
                     /db_xref="CDD:22944"
     variation       253
                     /allele="Q"
                     /allele="K"
                     /db_xref="dbSNP:8258202"
     CDS             1..530
                     /gene="Ugt2b5"
                     /coded_by="NM_009467.1:13..1605"
                     /note="go_component: microsome [goid 0005792] [evidence
                     IEA];
                     go_component: integral to membrane [goid 0016021]
                     [evidence IEA];
                     go_function: transferase activity, transferring glycosyl
                     groups [goid 0016757] [evidence IEA];
                     go_function: glucuronosyltransferase activity [goid
                     0015020] [evidence IEA];
                     go_function: transferase activity [goid 0016740] [evidence
                     IEA];
                     go_function: transferase activity, transferring hexosyl
                     groups [goid 0016758] [evidence IEA];
                     go_process: metabolism [goid 0008152] [evidence IEA]"
                     /db_xref="GeneID:22238"
                     /db_xref="LocusID:22238"
                     /db_xref="MGI:98900"
ORIGIN
        1 mpgkwisall llqisccfrs vkcgkvlvwp mefshwmnik iildelvqrg hevtvlrpsa
       61 yyvldpkksp glkfetfpts vskdnlenff ikfvdvwtye mprdtclsys pllqnmidef
      121 sdyflslckd vvsnkelmtk lqeskfdvll sdpvascgel iaellqipfl ysirfspgyq
      181 iekssgrfll ppsyvpvils glggqmtfie riknmicmly fdfwfqmfnd kkwdsfysey
      241 lgrpttlvet mgqaemwlir snwdlefphp tlpnvdyvgg lhckpakplp kdmeefvqss
      301 gdhgvvvfsl gsmvsnmtee kanaiawala qipqkvlwkf dgktpatlgh ntrvykwlpq
      361 ndllghpktk afvthggang vyeaiyhgip migiplfgeq hdniahmvak gaavalnirt
      421 msksdvlnal eevienpfyk knaiwlstih hdqpmkpldr avfwiefvmr hkrakhlrpl
      481 ghnltwyqyh sldvigflls cvattivlsv kcllfiyrff vkkenkmkne
//



CLUSTAL W (1.7) Multiple Sequence Alignments 



Sequence format is Pearson 
Sequence 1: 2912330CD1 529 aa 

Sequence 2: g6678501 530 aa 

Start of Pairwise alignments 
Aligning . . . 

Sequences (1:2) Aligned. Score: 66 
Start of Multiple Alignment 
There are 1 groups 
Aligning . . . 

Group 1: Sequences: 2 Score: 5979 

Alignment Score 2326 

CLUSTAL- Alignment file created [baaoraiyA. aln] 
CLUSTAL W (1.7) multiple sequence alignment 



2912330CD1      MSMKWTSALLLIQLSCYFSSGSCGKVLVWPTEFSHWMNIKTILDELVQRGHEVTVLASSA
g6678501        MPGKWISALLLLQISCCFRSVKCGKVLVWPMEFSHWMNIKIILDELVQRGHEVTVLRPSA
                * ** *****.*.** * * ******** ********* *************** **

2912330CD1      SISFDPNSPSTLKFEVYPVSLTKTEFEDIIKQLVKRWA-ELPKDTFWSYFSQVQEIMWTF
g6678501        YYVLDPKKSPGLKFETFPTSVSKDNLENFFIKFVDVWTYEMPRDTCLSYSPLLQNMIDEF
                .**. **** . * *..* ..* * *. *.*.** ** .*... *

2912330CD1      NDILRKFCKDIVSNKKLMKKLQESRFDVVLADAVFPFGELLAELLKIPFVYSLRFSPGYA
g6678501        SDYFLSLCKDVVSNKELMTKLQESKFDVLLSDPVASCGELIAELLQIPFLYSIRFSPGYQ
                * . .***.****.** *****.***.*.* * ***.****.***.**.******

2912330CD1      IEKHSGGLLFPPSYVPVVMSELSDQMTFIERVKNMIYVLYFEFWFQIFDMKKWDQFYSEV
g6678501        IEKSSGRFLLPPSYVPVILSGLGGQMTFIERIKNMICMLYFDFWFQMFNDKKWDSFYSEY
                *** ** .*.*******..* * *******.**** .***.****.*. **** ****

2912330CD1      LGRPTTLSETMAKADIWLIRNYWDFQFPHPLLPNVEFVGGLHCKPAKPLPKEMEEFVQSS
g6678501        LGRPTTLVETMGQAEMWLIRSNWDLEFPHPTLPNVDYVGGLHCKPAKPLPKDMEEFVQSS
                ******* *** .*..**** **..**** ****..**************.********

2912330CD1      GENGVVVFSLGSMVSNTSEERANVIASALAKIPQKVLWRFDGNKPDTLGLNTRLYKWIPQ
g6678501        GDHGVVVFSLGSMVSNMTEEKANAIAWALAQIPQKVLWKFDGKTPATLGHNTRVYKWLPQ
                *..************* .**.**** ***.*******.***. * * * * ***.***.**

2912330CD1      NDLLGHPKTKAFITHGGMNGIYEAIYHGVPMVGVPIFGDQLDNIAHMKAKGAAVEINFKT
g6678501        NDLLGHPKTKAFVTHGGANGVYEAIYHGIPMIGIPLFGEQHDNIAHMVAKGAAVALNIRT
                ************.**** **.*******.**.*.*.**.* ****** ****** .*..*

2912330CD1      MTSEDLLRALRTVITDSSYKENAMRLSRIHHDQPVKPLDRAVFWIEFVMRHKGAKHLRSA
g6678501        MSKSDVLNALEEVIENPFYKKNAIWLSTIHHDQPMKPLDRAVFWIEFVMRHKRAKHLRPL
                *. *.* ** ** . **.**. ** ******.***************** *****

2912330CD1      AHDLTWFQHYSIDVIGFLLTCVATAIFLFTKCFLFSCQKFNKTRKIEKRE
g6678501        GHNLTWYQYHSLDVIGFLLSCVATTIVLSVKCLLFIYRFFVKKENKMKNE
                *.***.*..*.*******.****.* * * * . * * . * * . * *

1: CAA29657. unnamed protein p...[gi:55120]

LOCUS       CAA29657                 530 aa            linear   ROD 12-SEP-1993
DEFINITION  unnamed protein product [Mus musculus].
ACCESSION   CAA29657
VERSION     CAA29657.1  GI:55120
DBSOURCE    embl locus MMUDPGT, accession X06358.1
KEYWORDS
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (residues 1 to 530)
  AUTHORS   Kimura,T. and Owens,I.S.
  TITLE     Mouse UDP glucuronosyltransferase. cDNA and complete amino acid
            sequence and regulation
  JOURNAL   Eur. J. Biochem. 168 (3), 515-521 (1987)
  MEDLINE   88029469
   PUBMED   3117546
COMMENT     Data kindly reviewed (07-SEP-1988) by OWENS I.S.
FEATURES             Location/Qualifiers
     source          1..530
                     /organism="Mus musculus"
                     /strain="C57BL/6N."
                     /db_xref="taxon:10090"
                     /clone="UDPGTm-1"
                     /clone_lib="lambda gt11"
     Protein         1..530
                     /name="unnamed protein product"
     CDS             1..530
                     /coded_by="X06358.1:13..1605"
                     /note="UDP-glucuronosyltransferase precursor (530 AA)"
                     /db_xref="GOA:P17717"
                     /db_xref="Swiss-Prot:P17717"
ORIGIN
        1 mpgkwisall llqisccfrs vkcgkvlvwp mefshwmnik iildelvqrg hevtvlrpsa
       61 yyvldpkksp glkfetfpts vskdnlenff ikfvdvwtye mprdtclsys pllqnmidef
      121 sdyflslckd vvsnkelmtk lqeskfdvll sdpvascgel iaellqipfl ysirfspgyq
      181 iekssgrfll ppsyvpvils glggqmtfie riknmicmly fdfwfqmfnd kkwdsfysey
      241 lgrpttlvet mgqaemwlir snwdlefphp tlpnvdyvgg lhckpakplp kdmeefvqss
      301 gdhgvvvfsl gsmvsnmtee kanaiawala qipqkvlwkf dgktpatlgh ntrvykwlpq
      361 ndllghpktk afvthggang vyeaiyhgip migiplfgeq hdniahmvak gaavalnirt
      421 msksdvlnal eevienpfyk knaiwlstih hdqpmkpldr avfwiefvmr hkrakhlrpl
      481 ghnltwyqyh sldvigflls cvattivlsv kcllfiyrff vkkenkmkne
//



CLUSTAL W (1.7) Multiple Sequence Alignments 



Sequence format is Pearson 
Sequence 1: 2912330CD1 529 aa 

Sequence 2: g55120 530 aa 

Start of Pairwise alignments 
Aligning. . . 

Sequences (1:2) Aligned. Score: 66 
Start of Multiple Alignment 
There are 1 groups 
Aligning. . . 

Group 1: Sequences: 2 Score: 5979 

Alignment Score 2326 

CLUSTAL- Alignment file created [baasf aqUD . aln] 
CLUSTAL W (1.7) multiple sequence alignment 



2912330CD1      MSMKWTSALLLIQLSCYFSSGSCGKVLVWPTEFSHWMNIKTILDELVQRGHEVTVLASSA
g55120          MPGKWISALLLLQISCCFRSVKCGKVLVWPMEFSHWMNIKIILDELVQRGHEVTVLRPSA
                * ** *****.*.** * * ******** ********* *************** **

2912330CD1      SISFDPNSPSTLKFEVYPVSLTKTEFEDIIKQLVKRWA-ELPKDTFWSYFSQVQEIMWTF
g55120          YYVLDPKKSPGLKFETFPTSVSKDNLENFFIKFVDVWTYEMPRDTCLSYSPLLQNMIDEF
                .**. **** . * *..* ..* * *. *.*.** ** .*... *

2912330CD1      NDILRKFCKDIVSNKKLMKKLQESRFDVVLADAVFPFGELLAELLKIPFVYSLRFSPGYA
g55120          SDYFLSLCKDVVSNKELMTKLQESKFDVLLSDPVASCGELIAELLQIPFLYSIRFSPGYQ
                * . .***.****.** *****.***.*.* * ***.****.***.**.******

2912330CD1      IEKHSGGLLFPPSYVPVVMSELSDQMTFIERVKNMIYVLYFEFWFQIFDMKKWDQFYSEV
g55120          IEKSSGRFLLPPSYVPVILSGLGGQMTFIERIKNMICMLYFDFWFQMFNDKKWDSFYSEY
                *** ** .*.*******..* * *******.**** .***.****.*. **** ****

2912330CD1      LGRPTTLSETMAKADIWLIRNYWDFQFPHPLLPNVEFVGGLHCKPAKPLPKEMEEFVQSS
g55120          LGRPTTLVETMGQAEMWLIRSNWDLEFPHPTLPNVDYVGGLHCKPAKPLPKDMEEFVQSS
                ******* *** .*..**** **..**** ****..**************.********

2912330CD1      GENGVVVFSLGSMVSNTSEERANVIASALAKIPQKVLWRFDGNKPDTLGLNTRLYKWIPQ
g55120          GDHGVVVFSLGSMVSNMTEEKANAIAWALAQIPQKVLWKFDGKTPATLGHNTRVYKWLPQ
                *..************* .**.**** ***.*******.***. * * * * ***.***.**

2912330CD1      NDLLGHPKTKAFITHGGMNGIYEAIYHGVPMVGVPIFGDQLDNIAHMKAKGAAVEINFKT
g55120          NDLLGHPKTKAFVTHGGANGVYEAIYHGIPMIGIPLFGEQHDNIAHMVAKGAAVALNIRT
                ************.**** **.*******.**.*.*.**.* ****** ****** .*..*

2912330CD1      MTSEDLLRALRTVITDSSYKENAMRLSRIHHDQPVKPLDRAVFWIEFVMRHKGAKHLRSA
g55120          MSKSDVLNALEEVIENPFYKKNAIWLSTIHHDQPMKPLDRAVFWIEFVMRHKRAKHLRPL
                *. *.* ** ** . **.**. ** ******.***************** *****

2912330CD1      AHDLTWFQHYSIDVIGFLLTCVATAIFLFTKCFLFSCQKFNKTRKIEKRE
g55120          GHNLTWYQYHSLDVIGFLLSCVATTIVLSVKCLLFIYRFFVKKENKMKNE
                *.***.*..*.*******.****.* * **.** . * * . * *



Exhibit D

with Response dated 4/26/04
In USSN: 09/980,729

Hi: P17717 . UDP-glucuronosylt...[gi: 136725] 



BLink, Domains, Links 



LOCUS 

DEFINITION 

ACCESSION 
VERSION 
DBSOURCE 



KEYWORDS 



SOURCE 

ORGANISM 



REFERENCE 
AUTHORS 
TITLE 

JOURNAL 
MEDLINE 
PUBMED 
REMARK 

COMMENT 



P17717 530 aa linear ROD 16-OCT-2001 

UDP-glucuronosyl trans f erase 2B5 precursor, microsomal (UDPGT) 
(M-l) . 
P17717 

P17717 GI:136725 

swissprot: locus UDB5_M0USE, accession P17717; 
class: standard, 
created: Aug 1, 1990. 
sequence updated: Aug 1, 199 0. 
annotation updated: Oct 16, 2001. 
xrefs: gi : 55119 , gi : 55120 , gi : 90521 

xrefs { non- sequence databases) : MGI98900, InterProIPR002213 , 
Pf amPF00201 , PROSITEPS00375 

Transferase; Glycosyltransf erase ; Glycoprotein; Transmembrane; 
Signal; Multigene family; Microsome. 
Mus musculus (house mouse) 
Mus musculus 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 
Mammalia; Eutheria; Rodentia; Sciurognathi ; Muridae; Murinae ; Mus. 
1 {residues 1 to 530) 
Kimura,T. and Owens, I. S. 

Mouse UDP glucuronosyltransf erase . cDNA and complete amino acid 
sequence and regulation 

Eur. J. Biochem. 168 (3), 515-521 (1987) 
88029469 
3117546 

SEQUENCE FROM N.A. 
STRAIN=C57BL/6N; TISSUE=Liver 

This SWISS-PROT entry is copyright. It is produced through a 
collaboration between the Swiss Institute of Bioinf ormatics and 
the EMBL outstation - the European Bioinf ormatics Institute. 
The original entry is available from http : //www. expasy . ch/sprot 
and http : / /www. ebi . ac . uk/sprot 

[FUNCTION] UDPGT IS OF MAJOR IMPORTANCE IN THE CONJUGATION AND 
SUBSEQUENT ELIMINATION OF POTENTIALLY TOXIC XENOBIOTICS AND 
ENDOGENOUS COMPOUNDS. 

[CATALYTIC ACTIVITY] UDP - G LUCURONATE + ACCEPTOR = UDP + ACCEPTOR 
BETA- D - GLUCURONOS I DE . 
[SUBCELLULAR LOCATION] MICROSOMAL. 

[SIMILARITY] BELONGS TO THE UDP-GLYCOSYLTRANSFERASE FAMILY. 
Location/Qualifiers 
1. .530 

/organism="Mus musculus" 
/db_xref=" taxon: 10090" 
1. .530 
/gene=" UGT2B5 " 
Protein 1. .530 

/gene=" UGT2B5 " 

/product= n UDP-glucuronosyl trans f erase 2B5 precursor, 

microsomal " 

/ EC_number = " 2.4.1.17 " 
Region 1 . . 23 

/gene="UGT2B5" 

/ region_name= " Signal " 

/note="BY SIMILARITY." 
Region 24 . . 530 

/gene="UGT2B5" 

/region_name=" Mature chain" 

/ note= " UDP- GLUCURONOS YLTRANSFERASE 2B5 . " 
Site 316 



FEATURES 

source 



gene 



lof2 
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ence Viewer 



http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=protein&db=prot.. 



                     /gene="UGT2B5"
                     /site_type="glycosylation"
                     /note="N-LINKED (GLCNAC...) (POTENTIAL)."
     Site            483
                     /gene="UGT2B5"
                     /site_type="glycosylation"
                     /note="N-LINKED (GLCNAC...) (POTENTIAL)."
     Region          494..510
                     /gene="UGT2B5"
                     /region_name="Transmembrane region"
                     /note="POTENTIAL."
ORIGIN
        1 mpgkwisall llqisccfrs vkcgkvlvwp mefshwmnik iildelvqrg hevtvlrpsa
       61 yyvldpkksp glkfetfpts vskdnlenff ikfvdvwtye mprdtclsys pllqnmidef
      121 sdyflslckd vvsnkelmtk lqeskfdvll sdpvascgel iaellqipfl ysirfspgyq
      181 iekssgrfll ppsyvpvils glggqmtfie riknmicmly fdfwfqmfnd kkwdsfysey
      241 lgrpttlvet mgqaemwlir snwdlefphp tlpnvdyvgg lhckpakplp kdmeefvqss
      301 gdhgvvvfsl gsmvsnmtee kanaiawala qipqkvlwkf dgktpatlgh ntrvykwlpq
      361 ndllghpktk afvthggang vyeaiyhgip migiplfgeq hdniahmvak gaavalnirt
      421 msksdvlnal eevienpfyk knaiwlstih hdqpmkpldr avfwiefvmr hkrakhlrpl
      481 ghnltwyqyh sldvigflls cvattivlsv kcllfiyrff vkkenkmkne
//
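The ORIGIN section of a GenPept record lists residues in blocks of ten, six blocks to a row, with each row prefixed by the index of its first residue. The following is a minimal Python sketch, assuming that layout, which joins the blocks back into one sequence and checks each row index against the running residue count. The function name `parse_origin` is illustrative only and is not part of any NCBI tool.

```python
import re


def parse_origin(block: str) -> str:
    """Join the residue blocks of a GenPept ORIGIN section into one sequence.

    Raises ValueError when a row's leading index does not equal the number
    of residues read so far plus one.
    """
    pieces = []
    count = 0
    for raw in block.splitlines():
        line = raw.strip()
        # Skip blank lines, the ORIGIN keyword and the record terminator.
        if not line or line == "//" or line.upper() == "ORIGIN":
            continue
        match = re.match(r"^(\d+)\s+(.*)$", line)
        if match is None:
            raise ValueError(f"unexpected ORIGIN line: {raw!r}")
        index = int(match.group(1))
        # Keep letters only, so the spaces between ten-residue blocks vanish.
        residues = re.sub(r"[^A-Za-z]", "", match.group(2))
        if index != count + 1:
            raise ValueError(f"row index {index} does not follow {count} residues")
        pieces.append(residues)
        count += len(residues)
    return "".join(pieces).upper()


if __name__ == "__main__":
    sample = """ORIGIN
        1 mpgkwisall llqisccfrs vkcgkvlvwp mefshwmnik iildelvqrg hevtvlrpsa
       61 yyvldpkksp glkfetfpts vskdnlenff ikfvdvwtye mprdtclsys pllqnmidef
    //"""
    seq = parse_origin(sample)
    print(len(seq), seq[:10])
```

The index check is useful with scanned records, where a dropped character in one block shows up as a mismatch at the start of the next row.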
CLUSTAL W (1.7) Multiple Sequence Alignments

Sequence format is Pearson
Sequence 1: 2912330CD1     529 aa
Sequence 2: g136725        530 aa
Start of Pairwise alignments
Aligning...
Sequences (1:2) Aligned. Score: 66
Start of Multiple Alignment
There are 1 groups
Aligning...
Group 1: Sequences: 2      Score: 5979
Alignment Score 2326
CLUSTAL-Alignment file created [baac9ayTA.aln]

CLUSTAL W (1.7) multiple sequence alignment

2912330CD1      MSMKWTSALLLIQLSCYFSSGSCGKVLVWPTEFSHWMNIKTILDELVQRGHEVTVLASSA
g136725         MPGKWISALLLLQISCCFRSVKCGKVLVWPMEFSHWMNIKIILDELVQRGHEVTVLRPSA
                * t ** *****.*.** * * ******** ********* *************** **

2912330CD1      SISFDPNSPSTLKFEVYPVSLTKTEFEDIIKQLVKRWA-ELPKDTFWSYFSQVQEIMWTF
g136725         YYVLDPKKSPGLKFETFPTSVSKDNLENFFIKFVDVWTYEMPRDTCLSYSPLLQNMIDEF
                . * * . **** .* *..* ..*... ;;*. * ; *.*•** ** *

2912330CD1      NDILRKFCKDIVSNKKLMKKLQESRFDVVLADAVFPFGELLAELLKIPFVYSLRFSPGYA
g136725         SDYFLSLCKDVVSNKELMTKLQESKFDVLLSDPVASCGELIAELLQIPFLYSIRFSPGYQ
                * . .***.****.*******.***.*.** ^ ***.****.***.**.******

2912330CD1      IEKHSGGLLFPPSYVPVVMSELSDQMTFIERVKNMIYVLYFEFWFQIFDMKKWDQFYSEV
g136725         IEKSSGRFLLPPSYVPVILSGLGGQMTFIERIKNMICMLYFDFWFQMFNDKKWDSFYSEY
                *** ** .*.******..* * ^ *******.*** * .***.****.*. **** ****

2912330CD1      LGRPTTLSETMAKADIWLIRNYWDFQFPHPLLPNVEFVGGLHCKPAKPLPKEMEEFVQSS
g136725         LGRPTTLVETMGQAEMWLIRSNWDLEFPHPTLPNVDYVGGLHCKPAKPLPKDMEEFVQSS
                ******* *** .*..**** **..**** ****..**************.********

2912330CD1      GENGVVVFSLGSMVSNTSEERANVIASALAKIPQKVLWRFDGNKPDTLGLNTRLYKWIPQ
g136725         GDHGVVVFSLGSMVSNMTEEKANAIAWALAQIPQKVLWKFDGKTPATLGHNTRVYKWLPQ
                *..************* . * * . * * ^ * * ***.*******.***. ^* * * * ***.***.**

2912330CD1      NDLLGHPKTKAFITHGGMNGIYEAIYHGVPMVGVPIFGDQLDNIAHMKAKGAAVEINFKT
g136725         NDLLGHPKTKAFVTHGGANGVYEAIYHGIPMIGIPLFGEQHDNIAHMVAKGAAVALNIRT
                ************.**** **.*******.**.*.*.**.* ****** ****** .*..*

2912330CD1      MTSEDLLRALRTVITDSSYKENAMRLSRIHHDQPVKPLDRAVFWIEFVMRHKGAKHLRSA
g136725         MSKSDVLNALEEVIENPFYKKNAIWLSTIHHDQPMKPLDRAVFWIEFVMRHKRAKHLRPL
                *. *.* ** ** . **.**. ** ******.***************** *****

2912330CD1      AHDLTWFQHYSIDVIGFLLTCVATAIFLFTKCFLFSCQKFNKTRKIEKRE
g136725         GHNLTWYQYHSLDVIGFLLSCVATTIVLSVKCLLFIYRFFVKKENKMKNE
                *.***.*..*.*******.****.* * **.** . * * . * *
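The pairwise figure in the log ("Sequences (1:2) Aligned. Score: 66") is a percentage. As a minimal sketch, assuming the score is the number of identical residues divided by the number of positions compared, with gap positions excluded, it can be recomputed from any two aligned rows. The function name `percent_identity` is hypothetical, and the example uses only the final 50-column block of the alignment, so its result is not expected to equal the whole-sequence score.

```python
def percent_identity(row_a: str, row_b: str) -> float:
    """Percent identity over aligned columns in which neither row has a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    compared = 0
    identical = 0
    for x, y in zip(row_a.upper(), row_b.upper()):
        if x == "-" or y == "-":
            continue  # gap columns are not counted as compared positions
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Final block of the 2912330CD1 / g136725 alignment (50 columns, no gaps).
    query = "AHDLTWFQHYSIDVIGFLLTCVATAIFLFTKCFLFSCQKFNKTRKIEKRE"
    hit = "GHNLTWYQYHSLDVIGFLLSCVATTIVLSVKCLLFIYRFFVKKENKMKNE"
    print(f"{percent_identity(query, hit):.1f}")  # prints 54.0
```

Running the same function over all nine blocks joined end to end would give a whole-alignment figure to compare with the reported score, provided the rows are transcribed without OCR errors.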
CLUSTAL W (1.7) Multiple Sequence Alignments

Sequence format is Pearson
Sequence 1: 2912330CD1     529 aa
Sequence 2: g136729        533 aa
Sequence 3: g31377618      530 aa
Sequence 4: g29789078      530 aa
Sequence 5: g13487900      534 aa
Sequence 6: g10863941      528 aa
Sequence 7: g8850236       533 aa
Sequence 8: g6005930       534 aa
Sequence 9: g4507815       531 aa
Sequence 10: g41282213     530 aa
Sequence 11: g136731       534 aa
Start of Pairwise alignments
Aligning...
Sequences (1:2) Aligned. Score: 41
Sequences (1:3) Aligned. Score: 41
Sequences (1:4) Aligned. Score: 41
Sequences (1:5) Aligned. Score: 41
Sequences (1:6) Aligned. Score: 90
Sequences (1:7) Aligned. Score: 41
Sequences (1:8) Aligned. Score: 41
Sequences (1:9) Aligned. Score: 38
Sequences (1:10) Aligned. Score: 4
Sequences (1:11) Aligned. Score: 4
Sequences (2:3) Aligned. Score: 65
Sequences (2:4) Aligned. Score: 66
Sequences (2:5) Aligned. Score: 71
Sequences (2:6) Aligned. Score: 42
Sequences (2:7) Aligned. Score: 100
Sequences (2:8) Aligned. Score: 71
Sequences (2:9) Aligned. Score: 67
Sequences (2:10) Aligned. Score: 65
Sequences (2:11) Aligned. Score: 71
Sequences (3:4) Aligned. Score: 94
Sequences (3:5) Aligned. Score: 66
Sequences (3:6) Aligned. Score: 42
Sequences (3:7) Aligned. Score: 65
Sequences (3:8) Aligned. Score: 66
Sequences (3:9) Aligned. Score: 66
Sequences (3:10) Aligned. Score: 94
Sequences (3:11) Aligned. Score: 66
Sequences (4:5) Aligned. Score: 66
Sequences (4:6) Aligned. Score: 41
Sequences (4:7) Aligned. Score: 66
Sequences (4:8) Aligned. Score: 65
Sequences (4:9) Aligned. Score: 66
Sequences (4:10) Aligned. Score: 93
Sequences (4:11) Aligned. Score: 65
Sequences (5:6) Aligned. Score: 41
Sequences (5:7) Aligned. Score: 71
Sequences (5:8) Aligned. Score: 93
Sequences (5:9) Aligned. Score: 66
Sequences (5:10) Aligned. Score: 66
Sequences (5:11) Aligned. Score: 93
Sequences (6:7) Aligned. Score: 42
Sequences (6:8) Aligned. Score: 42
Sequences (6:9) Aligned. Score: 39
Sequences (6:10) Aligned. Score: 41
Sequences (6:11) Aligned. Score: 42
Sequences (7:8) Aligned. Score: 71
Sequences (7:9) Aligned. Score: 67
Sequences (7:10) Aligned. Score: 65
Sequences (7:11) Aligned. Score: 71
Sequences (8:9) Aligned. Score: 66
Sequences (8:10) Aligned. Score: 66
Sequences (8:11) Aligned. Score: 100
Sequences (9:10) Aligned. Score: 66
Sequences (9:11) Aligned. Score: 66
Sequences (10:11) Aligned. Score: 66
Guide tree file created: [baaBsaWDA.dnd]
Start of Multiple Alignment
There are 10 groups
Aligning...
Group 1: Sequences: 2      Score: 8554
Group 2: Sequences: 3      Score: 8520
Group 3: Sequences: 4      Score: 7773
Group 4: Sequences: 2      Score: 8823
Group 5: Sequences: 2      Score: 8860
Group 6: Sequences: 3      Score: 8533
Group 7: Sequences: 5      Score: 8317
Group 8: Sequences: 9      Score: 7685
Group 9: Sequences: 2      Score: 8364
Group 10: Sequences: 11    Score: 3915
Alignment Score 113614
CLUSTAL-Alignment file created [baaBsaWDA.aln]

CLUSTAL W (1.7) multiple sequence alignment




g3 1377618 — MARTGWTS PI PLCVSLL.LTCG- FAEAGKLLWPMDGSHWFTMQSWEKLILRGHEVW 

g29789078 --MARAGOTSPVPLCVCLLLTCG-FAEAGKLLWPMDGSHWFTMQSWEKLILRGHEVW 

g4 1282213 - - MARAGWTG L L PL YVC LLLTCG - F AKAG KLLWPMDG S HWFTMQ SWEKL I LRG H EVW 

g4 5 07 815 -MACLLRSFQRISAGVFFLALWG-MWGDKLLWPQDGSHWLSMKDIVEVLSDRGHEIW 
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ClustalW Results 



http://patents.incyte.com: 8000/cgi-bin/SeqServer/SeqServ 



gl36729 MAVESQGGRP-LVLGLLLCVLGPWSHAGKILLIPVDGSHWLSMLGAIQQLQQRGHEIW 

g8 85023 6 MAVTSSQGGRP-LVLGLLLCVLGPVVSHAGKILLIPVDGSHWLSMLGAIQQLQQRGHEIVV 

g6005930 MARGLQVPLPRLATGLLLLLSVQPWAESGKVLWPTDGSPWLSMREALRELHARGHQAW 

gl3 6731 MARGLQVPLPRLATGLLLLLSVQPWAESGKVLWPTDGSPWLSMREALRELHARGHQAW 

gl3487900 MATGLQVPLPWLATGLLLLLSVQPWAESGKVLWPIDGSHWLSMREVLRELHARGHQAW 

291233 0CD1 --MSMKWTSALLLIQLSCYFSSG--S-CGKVLVWPTEFSHWMNIKTILDELVQRGHEVTV 

gl0863941 - -MSMKWTS ALLLIQLSCYFS SG- - S - CGKVLVWPTEFSHWMNIKTILDELVQRGHEVTV 



g3 1377618 VMPEVSWQLGKSL- -NC TVKTYSTSYTLEDLDREFMDFADAQWKA- -QVRSLFSLFLSSS 

g2 9789 078 VMPEVSWQLERSL--NCTVKTYSTSYTLEDQNREFMVFAHAQWKA--QAQSIFSLLMSSS 

g4 1282213 VMPEVSWQLGRSL — NCTVKTYSTS YTLEDQDREFMVFADARWTA- - PLRSAFSLLTSSS 

g4507815 WPEVNLLLKEYK — YYTRKIYPVPYDQEELKNRYQSFGNNHFAE--RSFLTAPQTEYRN 

gl3 6729 LAPDASLYIRDGA- -FYTLKTYPVPFQREDVKESFVSLGHNVFEN- -DSFLQRVIKTYKK 

g8 8 5 02 3 6 LAPDASLYIRDGA- -FYTLKTYPVPFQREDVKESFVSLGHNVFEN- -DSFLQRVIKTYKK 

g6005930 LTPEVNMHIKEEK- -FFTLTAYAVPWTQKEFDRVTLGYTQGFFET- - EHLLKRYSRSMAI 

gl3 6731 LTPEVNMHIKEEK- -FFTLTAYAVPWTQKEFDRVTLGYTQGFFET- -EHLLKRYSRSMAI 

gl3487 9 00 LTPEVNMHIKEEN--FFTLTTYAI SWTQDEFDRHVLGHTQLYFET--EHFLKKFFRSMAM 

291233 0CD1 LASSASISFDPNSPSTLKFEVYPVSLTKTEFEDIIKQLVKRWAELPKDTFWSYFSQVQEI 

gl 08 63941 LASSASISFDPNSPSTLKFEVYPVSLTKTEFEDIIKQLVKRWAELPKDTFWSYFSQVQEI 



g3 1377 618 NG-FFNLFFSHCRSLFNDRKLVEYLKESSFDAVFLDPFDACGLIVAKYFSLPSWFARGI 

g2 97 89 07 8 SG-FLDLFFSHCRSLFNDRKLVEYLKESSFDAVFLDPFDTCGLIVAKYFSLPSWFTRGI 

g41282213 NG-IFDLFFSNCRSLFNDRKLVEYLKESCFDAVFLDPFDACGLIVAKYFSLPSWFARGI 

g45 07815 NMIVIGLYFINCQSLLQDRDTLNFFKESKFDALFTDPALPCGVILAEYLGLPSVYLFRGF 

gl3 6729 IKKDSAMLLSGCSHLLHNKELMASLAESSFDVMLTDPFLPCSPIVAQYLSLPTVFFLHAL 

g885023 6 IKKDSAMLLSGCSHLLHNKELMASLAESSFDVMLTDPFLPCSPIVAQYLSLPTVFFLHAL 

g6005930 MNNVSLALHRCCVELLHNEALIRHLNATSFDWLTDPVNLCGAVLAKYLSI PAVFFWRYI 

gl 3 67 3 1 MNNVSLALHRCCVELLHNEALIRHLNATSFDWLTDPVNLCGAVLAKYLSI PAVFFWRYI 

gl 3487900 LNNMS LVYHRSCVELLHNEALIRHLNATSFDWLTDPVNLCAAVLAKYLS I PTVFFLRNI 

291233 0CD1 MWTFNDILRKFCKDIVSNKKLMKKLQESRFDWLADAVFPFGELLAELLKIPFVYSLRFS 

gl0863941 MWTFNDILRKFCKDIVSNKKLMKKLQESRFDWLADAVFPFGELLAELLKI PFVYRPRFS 

* : : :**.::*. : 

g3 1377 618 ACHYLEEGAQ-CPAPLSYVPRILLGFSDAMTFKERVRNHIMHLEEHLFCQYF-SKNALEI 

g2 9789 07 8 FCHHLEEGAQ-CPAPLSYVPNDLLGFSDAMTFKERVWNHIVHLEDHLFCQYL-FRNALEI 

g4 12822 13 FCHYLEEGAQ-CPAPLSYVPRLLLGFSDAMTFKERVWNHIMHLEEHLFCPYF-FKNVLEI 

g4507815 PCSLEHTFSR-SPDPVSYIPRCYTKFSDHMTFSQRVANFLVNLLEPYLFYCL-FSKYEKL 

gl3 67 29 PCSLEFEATQ-CPNPFSYVPRPLSSHSDHMTFLQRVKNMLIAFSQNFLCDVV-YSPYATL 

g8 8 5 02 3 6 PCSLEFEATQ-CPNPFSYVPRPLSSHSDHMTFLQRVKNMLIAFSQNFLCDW-YSPYATL 

g600593 0 PCDLDFKGTQ-CPNPSSYIPKLLTTNSDHMTFLQRVKNMLYPLALSYICHTF-SAPYASL 

gl3 6731 PCDLDFKGTQ-CPNPSSYIPKLLTTNSDHMTFLQRVKNMLYPLALSYICHTF-SAPYASL 

gl 3487900 PCDLDFKGTQ-C PNPSS YI PRLLTTNSDHMTFMQRVKNMLYPLALSYICHAF - S APYAS L 

291233 0CD1 PGYAIEKHSGGLLFPPSYVPWMSELSDQMTFIERVKNMIYVLYFEFWFQIFDMKKWDQF 

gl 0863941 PGYAIEKHSGGLLFPPSWPVVMSELSDQMTFIERVKNMIYVLYFEFWFQIFDMKKWDQF 

g3 1377 61 8 ASEILQTPVTAYDLYSHTSIWLLRTDFVLDYPKPVMPNMIFIGGINCHQGKPLPMEFEAY 

g2 9789078 ASEILQTPVTAYDLYSHTSIWLLRTDFVLDYPKPVMPNMIFIGGINCHQGKPLPMEFEAY 

g4 1282213 ASEILQTPVTAYDLYSHTSIWLLRTDFVLEYPKPVMPNMIFIGGINCHQGKPVPMEFEAY 

g4507815 ASAVLKRDVDIITL-SEVSVWLLRYDFVLEYPRPVMPNMVFIGGINCKKRKDLSQEFEAY 

gl3 6729 ASEFLQREVTVQDLLSSASVWLFRSDFVKDYPRPIMPNMVFVGGINCLHQNPLSQEFEAY 

g8850236 ASEFLQREVTVQDLLSSASVWLFRSDFVKDYPRPIMPNMVFVGGINCLHQNPLSQEFEAY 

g60 0593 0 ASELFQREVSWDLVSYASVWLFRGDFVMDYPRPIMPNMVFIGGINCANGKPLSQEFEAY 

gl3 6731 ASELFQREVSWDLVSYASVWLFRGDFVMDYPRPIMPNMVFIGGINCANGKPLSQEFEAY 

gl3487900 ASELFQREVS WDILSHASVWLFRGDFVMDYPRPIMPNMVFIGGINCANRKPLSQEFEAY 

291233 0CD1 YSEVLGRPTTLSETMAKADIWLIRNYWDFQFPHPLLPNVEFVGGLHCKPAKPLPKEMEEF 

gl0863941 YSEVLGRPTTLSETMAKADIWLIRNYWDFQFPHPLLPNVEFVGGLHCKPAKPLPKEMEEF 



g31377618 

g29789078 

g41282213 

g4507815 

gl36729 

g8850236 



INASGEHGIWFSLGSMVSEI 
INASGEHGIWFSLGSMVSEI 
INASGEHGIWFSLGSMVSEI 
INASGEHGIWFSLGSMVSEI 
INASGEHGIWFSLGSMVSEI 
INASGEHGIWFSLGSMVSEI 



PEKKAMAI ADALGKI PQTVLWRYTGTRPSNLANNT I LVK 
PEKKAMAI ADALGKI PQTVLWRYTGTRPSNLANNTILVK 
PEKKAMAI ADALGKI PQTVLWRYTGTRPSNLANNTILVK 
PEKKAMAI AD ALGKNPQTVLWRYTGTRPSNLANNTILVK 
PEKKAMAI ADALGKI PQTVLWRYTGTRPSNLANNTILVK 
PEKKAMAI ADALGKI PQTVLWRYTGTRPSNLANNTILVK 
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g6 005930 INASGEHGIVWSLGSWSEIPEKKAMAIADALGKIP 

gl3 6731 INASGEHGIVWSLGSMVSEIPEKKAMAIADALGKIPQWLWRYTGTRPSNLANNTILW 

gl34879 00 INASGEHGI WF SLG SMVS E I PEKKAMAI ADALGK I PQTVLWRYTGTRPSNLANNTI LVK 

29123 3 0CD1 VQSSGENGVWFSLGSMVSNTS EERANVI ASALAKI PQKVLWRFDGNKPDTLGLNTRLYK 

gl 08 639 4 1 VQSSGENGVVWSLGSMVSNTSEERAWIASALAKIPQKVLWRFDGNKPDTLGLNTRLYK 
...***.*.*****★****. * ★ * * * * * * * * * . ★ . * * * * * * 



g31377618 

g29789078 

g41282213 

g4507815 

gl36729 

g8850236 

g6005930 

gl36731 

gl3487900 

2912330CD1 

gl0863941 



g31377618 

g29789078 

g41282213 

g4507815 

gl36729 

g8850236 

g6005930 

gl36731 

gl3487900 

2912330CD1 

gl0863941 



WL PQNDLLGH PMTRAF ITHAGS HGVYES I CNGVPMVMMPLFGDQMDNAKRMETKGAGVTL 

WLPQNDLLGHPMTRAFITHAGSHGVYESICNGVPMVMMPLFGDQMDNAKRMETKGAGVTL 

WLPQNDLLGHPMTRAFITHAGSHGVYESICNGVPMVMMPLFGDQMDNAKRMETKGAGVTL 

WLPQNDLLGHPMTRAFITHAGSHGVYESICNGVPMVMMPLFGDQMDNAKRMETKGAGVTL 

WLPQNDLLGHPMTRAFITHAGSHGVYESICNGVPMVMMPLFGDQMDNAKRMETKGAGVTL 

WL PQNDLLGH PMTRAF ITHAGS HGVYES I CNGVPMVMM PL FGDQMDNAKRMETKGAGVTL 

WLPQNDLLGHPMTRAFITHAGSHGVYESICNGVPMVMMPLFGDQMDNAKRMETKGAGVTL 

WL PQNDL LGH PMTRAF I THAGS HGVYE S I CNGVPMVMM PL FGDQMDNAKRMETKGAGVTL 

WLPQNDLLGHPMTRAFITHAGSHGVYESICNGVPMVMMPLFGDQMDNAKRMETKGAGVTL 

WIPQNDLLGHPKTKAFITHGGMNGIYEAIYHGVPMVGVPIFGDQLDNIAHMKAKGAAVEI 

WI PQNDLLGH PKTRAF I THGGANG I YKAI S PRI PMVGVPLFADQPDNI AHMKAKGAAVSL 
*.********* *.****** .*.*..* .*** * * * ★ .*..*** * 

NVLEMTSEDLENALKAVINDKSYKENIMRLSSLHKDRPVEPLDLAVFWVEFVMRHKGAPH 
NVLEMTSEDLENALKAVINDKSYKENIMRLSSLHKDRPVEPLDLAVFWVEFVMRHKGAPH 
NVLEMTS EDLENALKAVINDKS YKENI MRL S S LHKDRPVE PLDLAVF WVE F VMRHKGAPH 
NVLEMTSEDLENALKAVINDKSYKENIMRLSSLHKDRPVE PLDLAVFWVEFVMRHKGAPH 
NVLEMTS EDLENALKAVINDKS YKENI MRLS S LHKDRPVE PLDLAVFWVEFVMRHKGAPH 
NVLEMTS EDLENALKAVINDKS YKENI MRLS S LHKDRPVE PLDLAVFWVEFVMRHKGAPH 
NVLEMTS EDLENALKAVINDKSYKENI MRL SSLHKDRPVE PLDLAVFWVEFVMRHKGAPH 
NVLEMTS EDLENALKAVINDKSYKENIMRLSSLHKDRPVE PLDLAVFWVEFVMRHKGAPH 
NVLEMTS EDLENALKAVINDKS YKENI MRL S SLHKDRPVE PLDLAVFWVEFVMRHKGAPH 
NFKTMTSEDLLRALRTVITDSSYKENAMRLSRIHHDQPVKPLDRAVFWIEFVMRHKGAKH 
DFHTMS STDLLNALKTVINDPL YKENAMKLS RI HHDQ PVK PLDRAVFWI EFVMRHKGAKH 



g3 1377618 LRPAAHDLTWYQYHSLDVIGFLLAWLTVAFITFKCCAYGYRKCLGKKGRVKKAHKSKTH 

g2 9789078 LRPAAHDLTWYQYHSLDVIGFLLAWLTVAFITFKCC AYGYRKCLGKKGRVKKAHKSKTH 

g4 1282213 LRPAAHDLTWYQYHSLDVIGFLLAWLTVAF I TFKCC AYGYRKCLGKKGRVKKAHKSKTH 

g4 5 0 7 8 1 5 LRPAAHDLTWYQYHSLDVIGFLLAWLTVAF I TFKCC PYGYPKCLGKKGRVKKAHKSKTH 

gl3 6729 LRPAAHDLTWYQYHSLDVIGFLLAVVXjTVAF I TFKCC AYGYRKCLGKKGRVKKAHKSKTH 

g8 85 023 6 LRPAAHDLTWYQYHSLDVIGFLLAWLTVAF I TFKCC AYGYRKCLGKKGRVKKAHKSKTH 

g6 005 93 0 LRPAAHDLTWYQYHSLDVIGFLLAWLTVAF I TFKCC AYGYRKCLGKKGRVKKAHKSKTH 

gl3 6731 LRPAAHDLTWYQYHSLDVIGFLLAWLTVAF I TFKCC AYGYRKCLGKKGRVKKAHKSKTH 

gl 3487900 LRPAAHDLTWYQYHSLDVIGFLLAWLTVAF I TFKCC AYGYRKCLGKKGRVKKAHKSKTH 

29123 3 0CD1 LRSAAHDLTWFQHYSIDVIGFLLTCVATAIFLFTKCFLFSCQKFNKTR-KIEKRE 

gl 08 63 941 LRVAAHDLTWFQ YHS LDVTGFLLAC VATVI F I ITKC - LFC VWKFVRTG - KKGKRD 
The UDP glycosyltransferase gene superfamily: recommended
nomenclature update based on evolutionary divergence.

Mackenzie PI, Owens IS, Burchell B, Bock KW, Bairoch A, Belanger A, 
Fournel-Gigleux S, Green M, Hum DW, Iyanagi T, Lancet D, Louisot P, 
Magdalou J, Chowdhury JR, Ritter JK, Schachter H, Tephly TR, Tipton KF, 
Nebert DW. 

Department of Clinical Pharmacology, Flinders University of South Australia, 
Bedford Park. 

This review represents an update of the nomenclature system for the UDP 
glucuronosyltransferase gene superfamily, which is based on divergent evolution. 
Since the previous review in 1991, sequences of many related UDP 
glycosyltransferases from lower organisms have appeared in the database, which 
expand our database considerably. At latest count, in animals, yeast, plants and 
bacteria there are 110 distinct cDNAs/genes whose protein products all contain a 
characteristic 'signature sequence' and, thus, are regarded as members of the same 
superfamily. Comparison of a relatedness tree of proteins leads to the definition of 33 
families. It should be emphasized that at least six cloned UDP-GlcNAc 
N-acetylglucosaminyltransferases are not sufficiently homologous to be included as 
members of this superfamily and may represent an example of convergent evolution. 
For naming each gene, it is recommended that the root symbol UGT for human (Ugt 
for mouse and Drosophila), denoting 'UDP glycosyltransferase,' be followed by an 
Arabic number representing the family, a letter designating the subfamily, and an 
Arabic numeral denoting the individual gene within the family or subfamily, e.g. 
'human UGT2B4' and 'mouse Ugt2b5'. We recommend the name 'UDP
glycosyltransferase' because many of the proteins do not preferentially use UDP 
glucuronic acid, or their nucleotide sugar preference is unknown. Whereas the gene 
is italicized, the corresponding cDNA, transcript, protein and enzyme activity should 
be written with upper-case letters and without italics, e.g. 'human or mouse 
UGT1A1.' The UGT1 gene (spanning > 500 kb) contains at least 12 promoters/first 
exons, which can be spliced and joined with common exons 2 through 5, leading to 
different N-terminal halves but identical C-terminal halves of the gene products; in 
this scheme each first exon is regarded as a distinct gene (e.g. UGT1A1, UGT1A2, ... 
UGT1A12). When an orthologous gene between species cannot be identified with 
certainty, as occurs in the UGT2B subfamily, sequential naming of the genes is being 
carried out chronologically as they become characterized. We suggest that the 
Human Gene Nomenclature Guidelines 
(http://www.gene.ucl.ac.uk/nomenclature/guidelines.html) be used for all
species other than the mouse and Drosophila. Thirty published human UGT1A1
mutant alleles responsible for clinical hyperbilirubinemias are listed herein, and 
given numbers following an asterisk (e.g. UGT1A1*30) consistent with the Human 
Gene Nomenclature Guidelines. It is anticipated that this UGT gene nomenclature 
system will require updating on a regular basis. 
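The naming rule in the abstract (root symbol, Arabic family number, subfamily letter, Arabic gene number, and an optional allele number after an asterisk) can be expressed as a small parser. This is a minimal Python sketch under stated assumptions: it accepts only symbols that carry all three parts after the root, and the names `UgtSymbol` and `parse_ugt` are hypothetical rather than taken from any nomenclature tool.

```python
import re
from typing import NamedTuple, Optional


class UgtSymbol(NamedTuple):
    root: str               # "UGT" for human, "Ugt" for mouse and Drosophila
    family: int             # Arabic number for the family
    subfamily: str          # letter for the subfamily, stored in upper case
    gene: int               # Arabic number for the individual gene
    allele: Optional[int]   # number after the asterisk, if present


_SYMBOL = re.compile(r"^(UGT|Ugt)(\d+)([A-Za-z])(\d+)(?:\*(\d+))?$")


def parse_ugt(symbol: str) -> UgtSymbol:
    """Split a UGT gene or allele symbol into its nomenclature parts."""
    match = _SYMBOL.match(symbol.strip())
    if match is None:
        raise ValueError(f"not a recognised UGT symbol: {symbol!r}")
    root, family, subfamily, gene, allele = match.groups()
    return UgtSymbol(
        root,
        int(family),
        subfamily.upper(),
        int(gene),
        int(allele) if allele is not None else None,
    )


if __name__ == "__main__":
    for name in ("UGT2B4", "Ugt2b5", "UGT1A1*30"):
        print(name, parse_ugt(name))
```

Keeping the root as written preserves the species convention (upper case for human, mixed case for mouse), while normalising the subfamily letter makes symbols from different species comparable.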

Publication Types: 

• Review 

• Review, Tutorial 

MeSH Terms: 

• Amino Acid Sequence 

• Animals 

• Evolution, Molecular* 

• Genes, Structural* 

• Glucuronosyltransferase/chemistry 

• Glucuronosyltransferase/genetics* 

• Human 

• Molecular Sequence Data 

• Multigene Family* 

• Sequence Alignment 

• Sequence Homology, Amino Acid 

• Support, U.S. Gov't, P.H.S. 

• Terminology* 

Substances: 

• Glucuronosyltransferase 
LOCUS       P22309                   533 aa            linear   PRI 15-MAR-2004
DEFINITION  UDP-glucuronosyltransferase 1-1 precursor, microsomal
            (UDP-glucuronosyltransferase 1A1) (UDPGT) (UGT1*1) (UGT1-01)
            (UGT1.1) (UGT-1A) (UGT1A) (Bilirubin specific UDPGT isozyme 1)
            (HUG-BR1).
ACCESSION   P22309
VERSION     P22309  GI:136729
DBSOURCE    swissprot: locus UD11_HUMAN, accession P22309;
            class: standard,
            created: Aug 1, 1991.
            sequence updated: Aug 1, 1991.
            annotation updated: Mar 15, 2004.
            xrefs: gi: 340131, gi: 340132, gi: 340129, gi: 459838, gi: 340127,
            gi: 340128, gi: 184472, gi: 184473, gi: 11118740, gi: 11118749, gi:
            5732165, gi: 6094671, gi: 3059176, gi: 3059177, gi: 87534
            xrefs (non-sequence databases): GenewHGNC:12530, MIM 191740, MIM
            143500, MIM 218800, MIM 606785, GO0006789, GO0008210,
            InterProIPR002213, PfamPF00201, PROSITEPS00375
KEYWORDS    Transferase; Glycosyltransferase; Glycoprotein; Transmembrane;
            Signal; Multigene family; Microsome; Alternative splicing; Disease
            mutation.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (residues 1 to 533)
  AUTHORS   Ritter,J.K., Crawford,J.M. and Owens,I.S.
  TITLE     Cloning of two human liver bilirubin UDP-glucuronosyltransferase
            cDNAs with expression in COS-1 cells
  JOURNAL   J. Biol. Chem. 266 (2), 1043-1047 (1991)
  MEDLINE   91093210
   PUBMED   1898728
  REMARK    SEQUENCE FROM N.A.
            TISSUE=Liver
REFERENCE   2  (residues 1 to 533)
  AUTHORS   Ritter,J.K., Chen,F., Sheen,Y.Y., Tran,H.M., Kimura,S.,
            Yeatman,M.T. and Owens,I.S.
  TITLE     A novel complex locus UGT1 encodes human bilirubin, phenol, and
            other UDP-glucuronosyltransferase isozymes with identical carboxyl
            termini
  JOURNAL   J. Biol. Chem. 267 (5), 3257-3261 (1992)
  MEDLINE   92147680
   PUBMED   1339448
  REMARK    SEQUENCE FROM N.A., AND TISSUE SPECIFICITY.
REFERENCE   3  (residues 1 to 533)
  AUTHORS   Gong,Q.H., Cho,J.W., Huang,T., Potter,C., Gholami,N., Basu,N.K.,
            Kubota,S., Carvalho,S., Pennington,M.W., Owens,I.S. and
            Popescu,N.C.
  TITLE     Thirteen UDPglucuronosyltransferase genes are encoded at the human
            UGT1 gene complex locus
  JOURNAL   Pharmacogenetics 11 (4), 357-368 (2001)
  MEDLINE   21327373
   PUBMED   11434514
  REMARK    SEQUENCE FROM N.A.
REFERENCE   4  (residues 1 to 533)
  AUTHORS   Gattung,S., Stoneking,T. and Davidson,T.
  TITLE     Direct Submission
  JOURNAL   Submitted (-MAR-1999)
  REMARK    SEQUENCE FROM N.A.
REFERENCE   5  (residues 1 to 533)
  AUTHORS   Ueyama,H., Koiwai,O., Soeda,Y., Sato,H., Satoh,Y., Ohkubo,I. and
            Doida,Y.
  TITLE     Analysis of the promoter of human bilirubin
            UDP-glucuronosyltransferase gene (UGT1*1) in relevance to Gilbert's
            syndrome
  JOURNAL   Hepatol. Res. 9, 152-163 (1997)
  REMARK    SEQUENCE OF 1-50 FROM N.A.
REFERENCE   6  (residues 1 to 533)
  AUTHORS   Bosma,P.J., Chowdhury,J.R., Huang,T.J., Lahiri,P., Elferink,R.P.,
            Van Es,H.H., Lederstein,M., Whitington,P.F., Jansen,P.L. and
            Chowdhury,N.R.
  TITLE     Mechanisms of inherited deficiencies of multiple
            UDP-glucuronosyltransferase isoforms in two patients with
            Crigler-Najjar syndrome, type I
  JOURNAL   FASEB J. 6 (10), 2859-2863 (1992)
  MEDLINE   92339803
   PUBMED   1634050
  REMARK    VARIANT CN-I PHE-375.
REFERENCE   7  (residues 1 to 533)
  AUTHORS   Aono,S., Yamada,Y., Keino,H., Hanada,N., Nakagawa,T., Sasaoka,Y.,
            Yazawa,T., Sato,H. and Koiwai,O.
  TITLE     Identification of defect in the genes for bilirubin
            UDP-glucuronosyl-transferase in a patient with Crigler-Najjar
            syndrome type II
  JOURNAL   Biochem. Biophys. Res. Commun. 197 (3), 1239-1244 (1993)
  MEDLINE   94107323
   PUBMED   8280139
  REMARK    VARIANTS CN-II ARG-71 AND ASP-486.
REFERENCE   8  (residues 1 to 533)
  AUTHORS   Moghrabi,N., Clarke,D.J., Boxer,M. and Burchell,B.
  TITLE     Identification of an A-to-G missense mutation in exon 2 of the UGT1
            gene complex that causes Crigler-Najjar syndrome type 2
  JOURNAL   Genomics 18 (1), 171-173 (1993)
  MEDLINE   94102756
   PUBMED   8276413
  REMARK    VARIANT CN-II ARG-331.
REFERENCE   9  (residues 1 to 533)
  AUTHORS   Ritter,J.K., Yeatman,M.T., Kaiser,C., Gridelli,B. and Owens,I.S.
  TITLE     A phenylalanine codon deletion at the UGT1 gene complex locus of a
            Crigler-Najjar type I patient generates a pH-sensitive bilirubin
            UDP-glucuronosyltransferase
  JOURNAL   J. Biol. Chem. 268 (31), 23573-23579 (1993)
  MEDLINE   94043159
   PUBMED   8226884
  REMARK    VARIANT CN-I PHE-170 DEL.
REFERENCE   10 (residues 1 to 533)
  AUTHORS   Labrune,P., Myara,A., Hadchouel,M., Ronchi,F., Bernard,O.,
            Trivin,F., Chowdhury,N.R., Chowdhury,J.R., Munnich,A. and
            Odievre,M.
  TITLE     Genetic heterogeneity of Crigler-Najjar syndrome type I: a study of
            14 cases
  JOURNAL   Hum. Genet. 94 (6), 693-697 (1994)
  MEDLINE   95080780
   PUBMED   7989045
  REMARK    VARIANTS CN-I VAL-292; GLU-308; ARG-357; THR-368; ARG-381; PRO-401
            AND GLU-428.
REFERENCE   11 (residues 1 to 533)
  AUTHORS   Seppen,J., Bosma,P.J., Goldhoorn,B.G., Bakker,C.T., Chowdhury,J.R.,
            Chowdhury,N.R., Jansen,P.L. and Oude Elferink,R.P.
  TITLE     Discrimination between Crigler-Najjar type I and II by expression
            of mutant bilirubin uridine diphosphate-glucuronosyltransferase
  JOURNAL   J. Clin. Invest. 94 (6), 2385-2391 (1994)
  MEDLINE   95081424
   PUBMED   7989595
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NCBI Sequence Viewer 



http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=Display&DB=p.. 



REMARK VARIANTS CN GLU-175; ARG-177; TRP-209; ARG-276 AND PHE-375 . 
REFERENCE 12 (residues 1 to 533) 

AUTHORS Aono,S., Adachi,Y., Uyama,E., Yamada,Y., Keino,H., Nanno,T., 

Koiwai,0. and Sato,H. 
TITLE Analysis of genes for bilirubin UDP-glucuronosyl transferase in 

Gilbert ' s syndrome 
JOURNAL Lancet 345 (8955), 958-959 (1995) 
MEDLINE 95231122 
PUBMED 7715297 

REMARK VARIANTS GILBERT SYNDROME ARG-71; GLN-229 AND GLY-367. 

TISSUE=Liver, and Peripheral blood leukocytes 
REFERENCE 13 (residues 1 to 533) 

AUTHORS Yamamoto, K. , Soeda,Y., Kamisako,T., Hosaka,H., Fukano,M. , Sato,H., 

Fujiyama, Y., Adachi,Y., Satoh,Y. and Bamba,T. 
TITLE Analysis of bilirubin uridine 5 ' -diphosphate 

(UDP) -glucuronosyl transferase gene mutations in seven patients with 

Crigler-Naj jar syndrome type II 
JOURNAL J. Hum. Genet. 43 (2), 111-114 (1998) 
MEDLINE 98284535 

PUBMED 9621515 
REMARK VARIANTS CN-II ARG-71; TRP-209; GLN-229 AND ASP-486 . 
REFERENCE 14 (residues 1 to 533) 

AUTHORS Maruo,Y., Sato,H., Yamano,T., Doida,Y. and Shimada , M . 
TITLE Gilbert syndrome caused by a homozygous missense mutation 

(Tyr486Asp) of bilirubin UDP-glucuronosyl trans f erase gene 
JOURNAL J. Pediatr. 132 (6), 1045-1047 (1998) 
MEDLINE 98291073 

PUBMED 9627603 
REMARK VARIANT GILBERT SYNDROME ASP-486. 
COMMENT 

This SWISS-PROT entry is copyright. It is produced through a 

collaboration between the Swiss Institute of Bioinf ormatics and 

the EMBL outstation - the European Bioinf ormatics Institute. 

The original entry is available from http : / /www . expasy . ch/ sprot 

and http : / /www . ebi . ac .uk/sprot 



FEATURES 

source 



[FUNCTION] UDPGT is of major importance in the conjugation and 
subsequent elimination of potentially toxic xenobiotics and 
endogenous compounds. This isoform glucuronidates bilirubin 
IX-alpha to form both the IX-alpha-C8 and IX-alpha-C12 
monoconjugates and diconjugate. 

[CATALYTIC ACTIVITY] UDP-glucuronate + acceptor = UDP + acceptor 
beta-D-glucuronoside . 
[SUBCELLULAR LOCATION] Microsomal. 

[ALTERNATIVE PRODUCTS] Event=Alternative splicing; Named 
isoforms=l; Comment = A number of i so forms are produced. The 
different isozymes have a different N- terminal domain and a common 
C-terminal domain of 245 residues; Name=l; IsoId=P22309-l ; 
Sequence=Di splayed . 

[TISSUE SPECIFICITY] Expressed in liver. Not expressed in skin or 
kidney. 

[DISEASE] Defects in UGT1A1 are the cause of Gilbert syndrome 
[MIM: 143500 ] . Gilbert syndrome occurs as a consequence of reduced 
bilirubin transferase activity and is often detected in young 
adults with vague nonspecific complaints. 

[DISEASE] Defects in UGT1A1 are the cause of Crigler-Naj jar 
syndrome type I (CN-I) [MIM: 218800 ] . CN-I patients have severe 
hyperbilirubinemia and usually die of kernicterus (bilirubin 
accumulation in the basal ganglia and brainstem nuclei) within the 
first year of life. CN-I inheritance is autosomal recessive. 
[DISEASE] Defects in UGT1A1 are the cause of Crigler-Naj jar 
syndrome type II (CN-II) [MIM: 606785] . CN-II patients have less 
severe hyperbilirubinemia and usually survive into adulthood 
without neurologic damage. Phenobarbital , which induces the 
partially deficient glucuronyl transferase, can diminish the 
jaundice. CN-II inheritance is autosomal dominant. 
[SIMILARITY] Belongs to the UDP-glycosyl transferase family. 

Location/Qualifiers 

1. .533 
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http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=Display&DB=p... 



/organism= "Homo sapiens" 
/ db_xr e f = " t axon : 9 6 0 6 " 
1. .533 

/gene= H UGTlAl" 

/note= " synonyms : UGT1, GNT1 " 
1. .533 

/gene="UGTlAl" 

/product= "UDP-glucuronosyl transferase 1-1 precursor, 

microsomal " 

/ EC_numbe r = " 2.4.1.17 " 

1. .25 

/gene="UGTlAl M 
/region_name= " Signal " 
/note= " Potential . " 
26. .533 

/gene="UGTlAl " 

/region_name= "Mature chain" 

/note= " TJDP-GLUCURONOSYLTRANSF ERASE 1-1. " 

71 

/gene="UGTlAl " 
/region_name= "Variant " 

/note="G -> R (in CN-II and Gilbert syndrome) . 

/FTId=VAR_009504 . " 

102 

/gene="UGTlAl" 
/site_type= "glycosylation" 

/note="N-linked (GlcNAc. . . ) (Potential) . " 
170 

/gene="UGTlAl" 
/region_name= "Variant " 

/note= "Missing (in CN-I; has nearly normal activity at pH 

7.6 and is inactive at pH 6.4). /FTId=VAR_007 695 . " 

175 

/gene="UGTlAl " 
/region_name= "Variant " 

/note="L -> E (in CN-II). /FTId=VAR_007696 . " 
177 

/gene="UGTlAl" 
/region_name= "Variant " 

/note="C -> R (in CN-I). /FTId=VAR_007 697 . " 
209 

/gene="UGTlAl" 

/ r egi on_name= " Var i an t " 

/note="R -> W (in CN-II). /FTId=VAR_007 698 . " 
229 

/gene="UGTlAl " 

/ r egi on_name= " Var i ant " 

/note="P -> Q (in CN-II and Gilbert syndrome) . 

/FTId=VAR_009505 . " 

276 

/gene="UGTlAl" 

/ region_name= "Variant " 

/note="G -> R (in CN-I). /FTId=VAR_007699 . " 
292 

/gene="UGTlAl" 
/region_name= "Variant " 

/note="A -> V (in CN-I). /FTId=VAR_007700 . " 
295 

/gene="UGTlAl " 
/site_type= "glycosylation" 
/note="N-linked (GlcNAc. . . ) (Potential) . " 
308 

/gene="UGTlAl" 
/region_name= " Variant " 

/note="G -> E (in CN-I). /FTId=VAR_007701 . " 
331 

/gene="UGTlAl" 
/region_name= " Variant " 

/note="Q -> R (in CN-II). /FTId=VAR_007702 . " 
     Site            347
                     /gene="UGT1A1"
                     /site_type="glycosylation"
                     /note="N-linked (GlcNAc...) (Potential)."
     Region          357
                     /gene="UGT1A1"
                     /region_name="Variant"
                     /note="Q -> R (in CN-I). /FTId=VAR_007703."
     Region          367
                     /gene="UGT1A1"
                     /region_name="Variant"
                     /note="R -> G (in Gilbert syndrome). /FTId=VAR_012283."
     Region          368
                     /gene="UGT1A1"
                     /region_name="Variant"
                     /note="A -> T (in CN-I). /FTId=VAR_007704."
     Region          375
                     /gene="UGT1A1"
                     /region_name="Variant"
                     /note="S -> F (in CN-I). /FTId=VAR_007705."
     Region          381
                     /gene="UGT1A1"
                     /region_name="Variant"
                     /note="S -> R (in CN-I). /FTId=VAR_007706."
     Region          401
                     /gene="UGT1A1"
                     /region_name="Variant"
                     /note="A -> P (in CN-I). /FTId=VAR_007707."
     Region          428
                     /gene="UGT1A1"
                     /region_name="Variant"
                     /note="K -> E (in CN-I). /FTId=VAR_007708."
     Region          486
                     /gene="UGT1A1"
                     /region_name="Variant"
                     /note="Y -> D (in CN-II and Gilbert syndrome).
                     /FTId=VAR_007709."
     Region          491..507
                     /gene="UGT1A1"
                     /region_name="Transmembrane region"
                     /note="Potential."



ORIGIN
        1 mavesqggrp lvlglllcvl gpvvshagki llipvdgshw lsmlgaiqql qqrgheivvl
       61 apdaslyird gafytlktyp vpfqredvke sfvslghnvf endsflqrvi ktykkikkds
      121 amllsgcshl lhnkelmasl aessfdvmlt dpflpcspiv aqylslptvf flhalpcsle
      181 featqcpnpf syvprplssh sdhmtflqrv knmliafsqn flcdvvyspy atlaseflqr
      241 evtvqdllss asvwlfrsdf vkdyprpimp nmvfvgginc lhqnplsqef eayinasgeh
      301 givvfslgsm vseipekkam aiadalgkip qtvlwrytgt rpsnlannti lvkwlpqndl
      361 lghpmtrafi thagshgvye sicngvpmvm mplfgdqmdn akrmetkgag vtlnvlemts
      421 edlenalkav indksykeni mrlsslhkdr pvepldlavf wvefvmrhkg aphlrpaahd
      481 ltwyqyhsld vigfllavvl tvafitfkcc aygyrkclgk kgrvkkahks kth
//



2: NP_061949. UDP glycosyltrans...[gi:31377618]






LOCUS       NP_061949                530 aa            linear   PRI 22-DEC-2003
DEFINITION  UDP glycosyltransferase 1 family, polypeptide A8 [Homo sapiens].
ACCESSION   NP_061949
VERSION     NP_061949.3  GI:31377618
DBSOURCE    REFSEQ: accession NM_019076.3
KEYWORDS    .
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (residues 1 to 530)
  AUTHORS   Gregory,P.A., Gardner-Stephen,D.A., Lewinsky,R.H., Duncliffe,K.N.
            and Mackenzie,P.I.
  TITLE     Cloning and characterization of the human
            UDP-glucuronosyltransferase 1A8, 1A9, and 1A10 gene promoters:
            differential regulation through an interior-like region
  JOURNAL   J. Biol. Chem. 278 (38), 36107-36114 (2003)
   PUBMED   12847094
  REMARK    GeneRIF: UGT1A8, 1A9, and 1A10 genes are differentially regulated
            through an initiator element in their 5'-flanking regions
REFERENCE   2  (residues 1 to 530)
  AUTHORS   Huang,Y.H., Galijatovic,A., Nguyen,N., Geske,D., Beaton,D.,
            Green,J., Green,M., Peters,W.H. and Tukey,R.H.
  TITLE     Identification and functional characterization of
            UDP-glucuronosyltransferases UGT1A8*1, UGT1A8*2 and UGT1A8*3
  JOURNAL   Pharmacogenetics 12 (4), 287-297 (2002)
   PUBMED   12042666
REFERENCE   3  (residues 1 to 530)
  AUTHORS   Gong,Q.H., Cho,J.W., Huang,T., Potter,C., Gholami,N., Basu,N.K.,
            Kubota,S., Carvalho,S., Pennington,M.W., Owens,I.S. and
            Popescu,N.C.
  TITLE     Thirteen UDPglucuronosyltransferase genes are encoded at the human
            UGT1 gene complex locus
  JOURNAL   Pharmacogenetics 11 (4), 357-368 (2001)
   PUBMED   11434514
REFERENCE   4  (residues 1 to 530)
  AUTHORS   Strassburg,C.P., Manns,M.P. and Tukey,R.H.
  TITLE     Expression of the UDP-glucuronosyltransferase 1A locus in human
            colon. Identification and characterization of the novel
            extrahepatic UGT1A8
  JOURNAL   J. Biol. Chem. 273 (15), 8719-8726 (1998)
   PUBMED   9535849
REFERENCE   5  (residues 1 to 530)
  AUTHORS   Mackenzie,P.I., Owens,I.S., Burchell,B., Bock,K.W., Bairoch,A.,
            Belanger,A., Fournel-Gigleux,S., Green,M., Hum,D.W., Iyanagi,T.,
            Lancet,D., Louisot,P., Magdalou,J., Chowdhury,J.R., Ritter,J.K.,
            Schachter,H., Tephly,T.R., Tipton,K.F. and Nebert,D.W.
  TITLE     The UDP glycosyltransferase gene superfamily: recommended
            nomenclature update based on evolutionary divergence
  JOURNAL   Pharmacogenetics 7 (4), 255-269 (1997)
   PUBMED   9295054
COMMENT     PROVISIONAL REFSEQ: This record has not yet been subject to final
            NCBI review. The reference sequence was derived from AF462268.1.
            On Jun 4, 2003 this sequence version replaced gi:19424142.
FEATURES             Location/Qualifiers
     source          1..530
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="2"
                     /map="2q37"
     Protein         1..530
                     /product="UDP glycosyltransferase 1 family, polypeptide
                     A8"
     variation       15
                     /replace="C"
                     /replace="S"
                     /db_xref="dbSNP:1126783"
     Region          26..522
                     /region_name="UDP-glucoronosyl and UDP-glucosyl
                     transferase"
                     /note="UDPGT"
                     /db_xref="CDD:22944"
     variation
     variation
     variation

95 

/replace 
/replace 
/db_xref 
105 

/replace="L" 
/replace="M" 
/db_xref =" dbSNP 
109 

/replace= 
/replace= 



H" 
D" 

dbSNP: 1126785" 



1126788" 



•L" 
dbSNP:1126792"



variation 



variation 



variation 



variation 



variation 



variation 



CDS 



/db_xref 
110 

/replace 
/replace 
/db_xref 
118 

/replace 
/replace 
/db_xref 
208 

/replace 
/replace 
/db_xref 
212 

/replace 
/replace 
/db_xref 
216 

/replace 
/replace 
/db_xref 
508 

/replace 
/replace 

/db_xref 

1..530
/gene="UGT1A8"
/coded_by="NM_019076.3:64..1656"
/note="go_function: UDP-glycosyltransferase activity [goid
0008194] [evidence P] [pmid 9295054];
go_process: metabolism [goid 0008152] [evidence P] [pmid
9295054]"
/db_xref="GeneID:54576"
/db_xref="LocusID:54576"
/db_xref="MIM:606433"



ORIGIN 



"dbSNP: 112 6793" 



D" 
N" 

dbSNP: 1126798" 



"R" 
"W" 

" dbSNP: 

M M" 
"V" 

" dbSNP : 

"D" 
"E" 

" dbSNP: 



1126802" 



1126803" 



1126804" 



ii p ii 
"A" 

"dbSNP: 1042709" 



        1 martgwtspi plcvsllltc gfaeagkllv vpmdgshwft mqsvveklil rghevvvvmp
       61 evswqlgksl nctvktysts ytledldref mdfadaqwka qvrslfslfl sssngffnlf
      121 fshcrslfnd rklveylkes sfdavfldpf dacglivaky fslpsvvfar giachyleeg
      181 aqcpaplsyv prillgfsda mtfkervrnh imhleehlfc qyfsknalei aseilqtpvt
      241 aydlyshtsi wllrtdfvld ypkpvmpnmi figginchqg kplpmefeay inasgehgiv
      301 vfslgsmvse ipekkamaia dalgkipqtv lwrytgtrps nlanntilvk wlpqndllgh
      361 pmtrafitha gshgvyesic ngvpmvmmpl fgdqmdnakr metkgagvtl nvlemtsedl
      421 enalkavind ksykenimrl sslhkdrpve pldlavfwve fvmrhkgaph lrpaahdltw
      481 yqyhsldvig fllavvltva fitfkccayg yrkclgkkgr vkkahkskth
//



3: NP_061948. UDP glycosyltrans...[gi:29789078]






LOCUS       NP_061948                530 aa            linear   PRI 21-DEC-2003
DEFINITION  UDP glycosyltransferase 1 family, polypeptide A10;
            UDP-glucuronosyltransferase 1A10 [Homo sapiens].
ACCESSION   NP_061948
VERSION     NP_061948.1  GI:29789078
DBSOURCE    REFSEQ: accession NM_019075.1
KEYWORDS    .
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (residues 1 to 530)
  AUTHORS   Gregory,P.A., Gardner-Stephen,D.A., Lewinsky,R.H., Duncliffe,K.N.
            and Mackenzie,P.I.
  TITLE     Cloning and characterization of the human
            UDP-glucuronosyltransferase 1A8, 1A9, and 1A10 gene promoters:
            differential regulation through an interior-like region
  JOURNAL   J. Biol. Chem. 278 (38), 36107-36114 (2003)
   PUBMED   12847094
  REMARK    GeneRIF: UGT1A8, 1A9, and 1A10 genes are differentially regulated
            through an initiator element in their 5'-flanking regions
REFERENCE   2  (residues 1 to 530)
  AUTHORS   Elahi,A., Bendaly,J., Zheng,Z., Muscat,J.E., Richie,J.P. Jr.,
            Schantz,S.P. and Lazarus,P.
  TITLE     Detection of UGT1A10 polymorphisms and their association with
            orolaryngeal carcinoma risk
  JOURNAL   Cancer 98 (4), 872-880 (2003)
   PUBMED   12910533
  REMARK    GeneRIF: the UGT1A10 gene has several low-frequency missense
            polymorphisms and that the codon 139 polymorphism is an independent
            risk factor for orolaryngeal carcinoma in blacks
REFERENCE   3  (residues 1 to 530)
  AUTHORS   Gong,Q.H., Cho,J.W., Huang,T., Potter,C., Gholami,N., Basu,N.K.,
            Kubota,S., Carvalho,S., Pennington,M.W., Owens,I.S. and
            Popescu,N.C.
  TITLE     Thirteen UDPglucuronosyltransferase genes are encoded at the human
            UGT1 gene complex locus
  JOURNAL   Pharmacogenetics 11 (4), 357-368 (2001)
   PUBMED   11434514
REFERENCE   4  (residues 1 to 530)
  AUTHORS   Mackenzie,P.I., Owens,I.S., Burchell,B., Bock,K.W., Bairoch,A.,
            Belanger,A., Fournel-Gigleux,S., Green,M., Hum,D.W., Iyanagi,T.,
            Lancet,D., Louisot,P., Magdalou,J., Chowdhury,J.R., Ritter,J.K.,
            Schachter,H., Tephly,T.R., Tipton,K.F. and Nebert,D.W.
  TITLE     The UDP glycosyltransferase gene superfamily: recommended
            nomenclature update based on evolutionary divergence
  JOURNAL   Pharmacogenetics 7 (4), 255-269 (1997)
   PUBMED   9295054
REFERENCE   5  (residues 1 to 530)
  AUTHORS   Strassburg,C.P., Oldhafer,K., Manns,M.P. and Tukey,R.H.
  TITLE     Differential expression of the UGT1A locus in human liver, biliary,
            and gastric tissue: identification of UGT1A7 and UGT1A10
            transcripts in extrahepatic tissue
  JOURNAL   Mol. Pharmacol. 52 (2), 212-220 (1997)
   PUBMED   9271343
REFERENCE   6  (residues 1 to 530)
  AUTHORS   Jensen,L.B., Tallgren,A., Troest,T. and Jensen,S.B.
  TITLE     Effect of acupuncture on myogenic headache
  JOURNAL   Scand J Dent Res 85 (6), 456-470 (1977)
   PUBMED   271343
COMMENT     PROVISIONAL REFSEQ: This record has not yet been subject to final
            NCBI review. The reference sequence was derived from BC020971.1.
FEATURES             Location/Qualifiers
     source          1..530
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="2"
                     /map="2q37"
     Protein         1..530
                     /product="UDP glycosyltransferase 1 family, polypeptide
                     A10"
                     /note="UDP-glucuronosyltransferase 1A10"
     Region          26..522
                     /region_name="UDP-glucoronosyl and UDP-glucosyl
                     transferase"
                     /note="UDPGT"
                     /db_xref="CDD:22944"
     variation       139
                     /replace="K"
                     /replace="E"
                     /db_xref="dbSNP:10187694"
     variation       208
                     /replace="R"
                     /replace="W"
                     /db_xref="dbSNP:1126802"
     variation       508
                     /replace="P"
                     /replace="A"
                     /db_xref="dbSNP:1042709"
NCBI Sequence Viewer

http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=Display&DB=p..



     CDS             1..530
                     /gene="UGT1A10"
                     /coded_by="NM_019075.1:70..1662"
                     /note="go_function: UDP-glucuronosyltransferase [goid
                     0003981] [evidence NR]"
                     /db_xref="GeneID:54575"
                     /db_xref="LocusID:54575"
                     /db_xref="MIM:606435"
ORIGIN



        1 maragwtspv plcvcllltc gfaeagkllv vpmdgshwft mqsvveklil rghevvvvmp
       61 evswqlersl nctvktysts ytledqnref mvfahaqwka qaqsifslim ssssgfldlf
      121 fshcrslfnd rklveylkes sfdavfldpf dtcglivaky fslpsvvftr gifchhleeg
      181 aqcpaplsyv pndllgfsda mtfkervwnh ivhledhlfc qylfrnalei aseilqtpvt
      241 aydlyshtsi wllrtdfvld ypkpvmpnmi figginchqg kplpmefeay inasgehgiv
      301 vfslgsmvse ipekkamaia dalgkipqtv lwrytgtrps nlanntilvk wlpqndllgh
      361 pmtrafitha gshgvyesic ngvpmvmmpl fgdqmdnakr metkgagvtl nvlemtsedl
      421 enalkavind ksykenimrl sslhkdrpve pldlavfwve fvmrhkgaph lrpaahdltw
      481 yqyhsldvig fllavvltva fitfkccayg yrkclgkkgr vkkahkskth
//



4: NP_061966. UDP glycosyltrans...[gi:13487900]






LOCUS       NP_061966                534 aa            linear   PRI 25-JAN-2004
DEFINITION  UDP glycosyltransferase 1 family, polypeptide A3 [Homo sapiens].
ACCESSION   NP_061966
VERSION     NP_061966.1  GI:13487900
DBSOURCE    REFSEQ: accession NM_019093.2
KEYWORDS    .
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (residues 1 to 534)
  AUTHORS   Zhang,T., Haws,P. and Wu,Q.
  TITLE     Multiple Variable First Exons: A Mechanism for Cell- and
            Tissue-Specific Gene Regulation
  JOURNAL   Genome Res. 14 (1), 79-89 (2004)
   PUBMED   14672974
REFERENCE   2  (residues 1 to 534)
  AUTHORS   Mackenzie,P.I., Owens,I.S., Burchell,B., Bock,K.W., Bairoch,A.,
            Belanger,A., Fournel-Gigleux,S., Green,M., Hum,D.W., Iyanagi,T.,
            Lancet,D., Louisot,P., Magdalou,J., Chowdhury,J.R., Ritter,J.K.,
            Schachter,H., Tephly,T.R., Tipton,K.F. and Nebert,D.W.
  TITLE     The UDP glycosyltransferase gene superfamily: recommended
            nomenclature update based on evolutionary divergence
  JOURNAL   Pharmacogenetics 7 (4), 255-269 (1997)
   PUBMED   9295054
REFERENCE   3  (residues 1 to 534)
  AUTHORS   Ritter,J.K., Chen,F., Sheen,Y.Y., Tran,H.M., Kimura,S.,
            Yeatman,M.T. and Owens,I.S.
  TITLE     A novel complex locus UGT1 encodes human bilirubin, phenol, and
            other UDP-glucuronosyltransferase isozymes with identical carboxyl
            termini
  JOURNAL   J. Biol. Chem. 267 (5), 3257-3261 (1992)
   PUBMED   1339448
COMMENT     PROVISIONAL REFSEQ: This record has not yet been subject to final
            NCBI review. The reference sequence was derived from AY435138.1.
FEATURES             Location/Qualifiers
     source          1..534
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="2"
                     /map="2q37"
     Protein         1..534
                     /product="UDP glycosyltransferase 1 family, polypeptide
                     A3"
     variation       11
                     /replace="R"
                     /replace="W"



                     /db_xref="dbSNP:3821242"
     Region          24..510
                     /region_name="UDP-glucuronosyl and UDP-glucosyl
                     transferase [Carbohydrate transport and metabolism, Energy
                     production and conversion]"
                     /note="KOG1192"
                     /db_xref="CDD:18981"
     Region          29..526
                     /region_name="UDP-glucoronosyl and UDP-glucosyl
                     transferase"
                     /note="UDPGT"
                     /db_xref="CDD:24386"
     variation       47
                     /replace="A"
                     /replace="V"
                     /db_xref="dbSNP:6431625"
     variation       512
                     /replace="P"
                     /replace="A"
                     /db_xref="dbSNP:1042709"
     CDS             1..534
                     /gene="UGT1A3"
                     /coded_by="NM_019093.2:1..1605"
                     /note="go_component: microsome [goid 0005792] [evidence
                     IEA];
                     go_component: integral to membrane [goid 0016021]
                     [evidence IEA];
                     go_function: UDP-glycosyltransferase activity [goid
                     0008194] [evidence TAS] [pmid 9295054];
                     go_function: glucuronosyltransferase activity [goid
                     0015020] [evidence IEA];
                     go_process: metabolism [goid 0008152] [evidence TAS] [pmid
                     9295054]"
                     /db_xref="GeneID:54659"
                     /db_xref="LocusID:54659"
                     /db_xref="MIM:606428"



ORIGIN
        1 matglqvplp wlatglllll svqpwaesgk vlvvpidgsh wlsmrevlre lharghqavv
       61 ltpevnmhik eenfftltty aiswtqdefd rhvlghtqly fetehflkkf frsmamlnnm
      121 slvyhrscve llhnealirh lnatsfdvvl tdpvnlcaav lakylsiptv fflrnipcdl
      181 dfkgtqcpnp ssyiprlltt nsdhmtfmqr vknmlyplal syichafsap yaslaselfq
      241 revsvvdils hasvwlfrgd fvmdyprpim pnmvfiggin canrkplsqe feayinasge
      301 hgivvfslgs mvseipekka maiadalgki pqtvlwrytg trpsnlannt ilvkwlpqnd
      361 llghpmtraf ithagshgvy esicngvpmv mmplfgdqmd nakrmetkga gvtlnvlemt
      421 sedlenalka vindksyken imrlsslhkd rpvepldlav fwvefvmrhk gaphlrpaah
      481 dltwyqyhsl dvigfllavv ltvafitfkc caygyrkclg kkgrvkkahk skth
//



5: NP_066962. UDP glycosyltrans...[gi:10863941]






LOCUS       NP_066962                528 aa            linear   PRI 21-DEC-2003
DEFINITION  UDP glycosyltransferase 2 family, polypeptide B4;
            UDP-glucuronyltransferase, family 2, beta-4 [Homo sapiens].
ACCESSION   NP_066962
VERSION     NP_066962.1  GI:10863941
DBSOURCE    REFSEQ: accession NM_021139.1
KEYWORDS    .
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (residues 1 to 528)
  AUTHORS   Barbier,O., Duran-Sandoval,D., Pineda-Torra,I., Kosykh,V.,
            Fruchart,J.C. and Staels,B.
  TITLE     Peroxisome proliferator-activated receptor alpha induces hepatic
            expression of the human bile acid glucuronidating
            UDP-glucuronosyltransferase 2B4 enzyme
  JOURNAL   J. Biol. Chem. 278 (35), 32852-32860 (2003)
   PUBMED   12810707



  REMARK    GeneRIF: UGT2B4 expression is regulated by PPARalpha
REFERENCE   2  (residues 1 to 528)
  AUTHORS   Barbier,O., Torra,I.P., Sirvent,A., Claudel,T., Blanquart,C.,
            Duran-Sandoval,D., Kuipers,F., Kosykh,V., Fruchart,J.C. and
            Staels,B.
  TITLE     FXR induces the UGT2B4 enzyme in hepatocytes: a potential mechanism
            of negative feedback control of FXR activity
  JOURNAL   Gastroenterology 124 (7), 1926-1940 (2003)
   PUBMED   12806625
  REMARK    GeneRIF: Farnesoid X receptor (FXR) induces the UGT2B4 enzyme in
            hepatocytes; this study identifies UGT2B4 as a novel FXR target
            gene.
REFERENCE   3  (residues 1 to 528)
  AUTHORS   Mackenzie,P.I., Owens,I.S., Burchell,B., Bock,K.W., Bairoch,A.,
            Belanger,A., Fournel-Gigleux,S., Green,M., Hum,D.W., Iyanagi,T.,
            Lancet,D., Louisot,P., Magdalou,J., Chowdhury,J.R., Ritter,J.K.,
            Schachter,H., Tephly,T.R., Tipton,K.F. and Nebert,D.W.
  TITLE     The UDP glycosyltransferase gene superfamily: recommended
            nomenclature update based on evolutionary divergence
  JOURNAL   Pharmacogenetics 7 (4), 255-269 (1997)
   PUBMED   9295054
REFERENCE   4  (residues 1 to 528)
  AUTHORS   Monaghan,G., Clarke,D.J., Povey,S., See,C.G., Boxer,M. and
            Burchell,B.
  TITLE     Isolation of a human YAC contig encompassing a cluster of UGT2
            genes and its regional localization to chromosome 4q13
  JOURNAL   Genomics 23 (2), 496-499 (1994)
   PUBMED   7835904
REFERENCE   5  (residues 1 to 528)
  AUTHORS   Jin,C.J., Miners,J.O., Lillywhite,K.J. and Mackenzie,P.I.
  TITLE     cDNA cloning and expression of two new members of the human liver
            UDP-glucuronosyltransferase 2B subfamily
  JOURNAL   Biochem. Biophys. Res. Commun. 194 (1), 496-503 (1993)
   PUBMED   8333863
REFERENCE   6  (residues 1 to 528)
  AUTHORS   Ritter,J.K., Chen,F., Sheen,Y.Y., Lubet,R.A. and Owens,I.S.
  TITLE     Two human liver cDNAs encode UDP-glucuronosyltransferases with 2
            log differences in activity toward parallel substrates including
            hyodeoxycholic acid and certain estrogen derivatives
  JOURNAL   Biochemistry 31 (13), 3409-3414 (1992)
   PUBMED   1554722
REFERENCE   7  (residues 1 to 528)
  AUTHORS   Jackson,M.R., McCarthy,L.R., Harding,D., Wilson,S., Coughtrie,M.W.
            and Burchell,B.
  TITLE     Cloning of a human liver microsomal UDP-glucuronosyltransferase
            cDNA
  JOURNAL   Biochem. J. 242 (2), 581-588 (1987)
   PUBMED   3109396
COMMENT     PROVISIONAL REFSEQ: This record has not yet been subject to final
            NCBI review. The reference sequence was derived from Y00317.1.
FEATURES             Location/Qualifiers
     source          1..528
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="4"
                     /map="4q13"
     Protein         1..528
                     /product="UDP glycosyltransferase 2 family, polypeptide
                     B4"
                     /EC_number="2.4.1.17"
                     /note="UDP-glucuronyltransferase, family 2, beta-4"
     sig_peptide     1..23
                     /note="putative"
     mat_peptide     24..528
                     /product="mature UDPGT (AA 1-505) (EC 2.4.1.17)"
     Region          24..526
                     /region_name="UDP-glucoronosyl and UDP-glucosyl
                     transferase"
                     /note="UDPGT"






3/1/04 5:24 PM 






                     /db_xref="CDD:22944"
     CDS             1..528
                     /gene="UGT2B4"
                     /coded_by="NM_021139.1:38..1624"
                     /note="go_component: microsome [goid 0005792] [evidence
                     NAS];
                     go_component: integral to membrane [goid 0016021]
                     [evidence IEA];
                     go_function: glucuronosyltransferase activity [goid
                     0015020] [evidence IEA];
                     go_process: xenobiotic metabolism [goid 0006805] [evidence
                     IDA] [pmid 8333863];
                     go_process: estrogen catabolism [goid 0006711] [evidence
                     IDA] [pmid 8333863]"
                     /db_xref="GeneID:7363"
                     /db_xref="LocusID:7363"
                     /db_xref="MIM:600067"



ORIGIN
        1 msmkwtsall liqlscyfss gscgkvlvwp tefshwmnik tildelvqrg hevtvlassa
       61 sisfdpnsps tlkfevypvs ltktefedii kqlvkrwael pkdtfwsyfs qyqeimwtfn
      121 dilrkfckdi vsnkklmkkl qesrfdvvla davfpfgell aellkipfvy rprfspgyai
      181 ekhsggllfp psyvpvvmse lsdqmtfier vknmiyvlyf efwfqifdmk kwdqfysevl
      241 grpttlsetm akadiwlirn ywdfqfphpl lpnvefvggl hckpakplpk emeefvqssg
      301 engvvvfslg smvsntseer anviasalak ipqkvlwrfd gnkpdtlgln trlykwipqn
      361 dllghpktra fithggangi ykaispripm vgvplfadqp dniahmkakg aavsldfhtm
      421 sstdllnalk tvindplyke namklsrihh dqpvkpldra vfwiefvmrh kgakhlrvaa
      481 hdltwfqyhs ldvtgfllac vatvifiitk clfcvwkfvr tgkkgkrd
//



6: NP_000454. UDP glycosyltrans...[gi:8850236]






LOCUS       NP_000454                533 aa            linear   PRI 20-DEC-2003
DEFINITION  UDP glycosyltransferase 1 family, polypeptide A1 [Homo sapiens].
ACCESSION   NP_000454
VERSION     NP_000454.1  GI:8850236
DBSOURCE    REFSEQ: accession NM_000463.1
KEYWORDS    .
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (residues 1 to 533)
  AUTHORS   Ohnishi,A. and Emi,Y.
  TITLE     Rapid proteasomal degradation of translocation-deficient
            UDP-glucuronosyltransferase 1A1 proteins in patients with
            Crigler-Najjar type II
  JOURNAL   Biochem. Biophys. Res. Commun. 310 (3), 735-741 (2003)
   PUBMED   14550264
  REMARK    GeneRIF: a translocation-deficient UGT1A1 protein is involved in
            Crigler-Najjar syndrome
REFERENCE   2  (residues 1 to 533)
  AUTHORS   Innocenti,F. and Ratain,M.J.
  TITLE     Irinotecan treatment in cancer patients with UGT1A1 polymorphisms
  JOURNAL   Oncology (Huntington, N.Y.) 17 (5 Suppl 5), 52-55 (2003)
   PUBMED   12800608
  REMARK    GeneRIF: genetic variation in genes involved in the disposition of
            anticancer agents might alter patient's outcome.
REFERENCE   3  (residues 1 to 533)
  AUTHORS   Kohle,C., Mohrle,B., Munzel,P.A., Schwab,M., Wernet,D., Badary,O.A.
            and Bock,K.W.
  TITLE     Frequent co-occurrence of the TATA box mutation associated with
            Gilbert's syndrome (UGT1A1*28) with other polymorphisms of the
            UDP-glucuronosyltransferase-1 locus (UGT1A6*2 and UGT1A7*3) in
            Caucasians and Egyptians
  JOURNAL   Biochem. Pharmacol. 65 (9), 1521-1527 (2003)
   PUBMED   12732365
  REMARK    GeneRIF: Frequent haplotypes containing several UGT1 allelic
            variants should be taken into account in studies on the association
            between diseases, abnormal drug reactions, and UGT1 family



            polymorphisms.
REFERENCE   4  (residues 1 to 533)
  AUTHORS   Heeney,M.M., Howard,T.A., Zimmerman,S.A. and Ware,R.E.
  TITLE     UGT1A promoter polymorphisms influence bilirubin response to
            hydroxyurea therapy in sickle cell anemia
  JOURNAL   J. Lab. Clin. Med. 141 (4), 279-282 (2003)
   PUBMED   12677174
  REMARK    GeneRIF: The UGT1A promoter polymorphism is a powerful nonglobin
            genetic modifier in Sickle Cell Anemia that influences serum
            bilirubin both at baseline and on hydroxyurea therapy.
REFERENCE   5  (residues 1 to 533)
  AUTHORS   Yueh,M.F., Huang,Y.H., Hiller,A., Chen,S., Nguyen,N. and Tukey,R.H.
  TITLE     Involvement of the xenobiotic response element (XRE) in Ah
            receptor-mediated induction of human UDP-glucuronosyltransferase
            1A1
  JOURNAL   J. Biol. Chem. 278 (17), 15001-15006 (2003)
   PUBMED   12566446
  REMARK    GeneRIF: UGT1A1 induction by ligand binding to the Ah receptor was
            regionalized to a UGT1A1 enhancer region containing a xenobiotic
            response element (XRE) at -3381/-3299
REFERENCE   6  (residues 1 to 533)
  AUTHORS   Basu,N.K., Kole,L. and Owens,I.S.
  TITLE     Evidence for phosphorylation requirement for human bilirubin
            UDP-glucuronosyltransferase (UGT1A1) activity
  JOURNAL   Biochem. Biophys. Res. Commun. 303 (1), 98-104 (2003)
   PUBMED   12646172
  REMARK    GeneRIF: human bilirubin UDP-glucuronosyltransferase requires
            phosphorylation for activity
REFERENCE   7  (residues 1 to 533)
  AUTHORS   Bosma,P.J.
  TITLE     Inherited disorders of bilirubin metabolism
  JOURNAL   J. Hepatol. 38 (1), 107-117 (2003)
   PUBMED   12480568
  REMARK    GeneRIF: Crigler Najar syndrome and Gilbert syndrome caused by
            deficiency in hepatic glucuronidation of bilirubin resulting from
            mutation of UGT1A1 gene, (review)
REFERENCE   8  (residues 1 to 533)
  AUTHORS   Basten,G.P., Bao,Y. and Williamson,G.
  TITLE     Sulforaphane and its glutathione conjugate but not sulforaphane
            nitrile induce UDP-glucuronosyl transferase (UGT1A1) and
            glutathione transferase (GSTA1) in cultured cells
  JOURNAL   Carcinogenesis 23 (8), 1399-1404 (2002)
   PUBMED   12151360
  REMARK    GeneRIF: induction in cultured tumor cells by sulforaphane and its
            glutathione conjugate
REFERENCE   9  (residues 1 to 533)
  AUTHORS   Huang,C.S., Chang,P.F., Huang,M.J., Chen,E.S. and Chen,W.C.
  TITLE     Glucose-6-phosphate dehydrogenase deficiency, the UDP-glucuronosyl
            transferase 1A1 gene, and neonatal hyperbilirubinemia
  JOURNAL   Gastroenterology 123 (1), 127-133 (2002)
   PUBMED   12105841
  REMARK    GeneRIF: results indicate that carriage of the homozygous 211 G to
            A variation within the coding region is an additive risk factor for
            neonatal hyperbilirubinemia in G6PD-deficient Taiwanese male
            neonates.
REFERENCE   10 (residues 1 to 533)
  AUTHORS   Sugatani,J., Yamakawa,K., Yoshinari,K., Machida,T., Takagi,H.,
            Mori,M., Kakizaki,S., Sueyoshi,T., Negishi,M. and Miwa,M.
  TITLE     Identification of a defect in the UGT1A1 gene promoter and its
            association with hyperbilirubinemia
  JOURNAL   Biochem. Biophys. Res. Commun. 292 (2), 492-497 (2002)
   PUBMED   11906189
  REMARK    GeneRIF: polymorphism in the UGT1A1 gene promoter and its
            association with hyperbilirubinemia
REFERENCE   11 (residues 1 to 533)
  AUTHORS   Fertrin,K.Y., Goncalves,M.S., Saad,S.T. and Costa,F.F.
  TITLE     Frequencies of UDP-glucuronosyltransferase 1 (UGT1A1) gene promoter
            polymorphisms among distinct ethnic groups from Brazil
  JOURNAL   Am. J. Med. Genet. 108 (2), 117-119 (2002)



   PUBMED   11857560
  REMARK    GeneRIF: The high frequencies of (TA)(7) polymorphism among the
            three groups confirm previous data that this polymorphism is very
            ancient and appears to be distributed throughout the world.
REFERENCE   12 (residues 1 to 533)
  AUTHORS   Sappal,B.S., Ghosh,S.S., Shneider,B., Kadakol,A., Chowdhury,J.R.
            and Chowdhury,N.R.
  TITLE     A novel intronic mutation results in the use of a cryptic splice
            acceptor site within the coding region of UGT1A1, causing
            Crigler-Najjar syndrome type 1
  JOURNAL   Mol. Genet. Metab. 75 (2), 134-142 (2002)
   PUBMED   11855932
  REMARK    GeneRIF: A novel G > A mutation at the splice acceptor site
            intron 4, causing Crigler-Najjar syndrome type 1
REFERENCE   13 (residues 1 to 533)
  AUTHORS   Passon,R.G., Howard,T.A., Zimmerman,S.A., Schultz,W.H. and
            Ware,R.E.
  TITLE     Influence of bilirubin uridine diphosphate-glucuronosyltransferase
            1A promoter polymorphisms on serum bilirubin levels and
            cholelithiasis in children with sickle cell anemia
  JOURNAL   J. Pediatr. Hematol. Oncol. 23 (7), 448-451 (2001)
   PUBMED   11878580
  REMARK    GeneRIF: Polymorphisms in UGT1A1 causes cholelithiasis and a
            modifier of bilirubin metabolism
REFERENCE   14 (residues 1 to 533)
  AUTHORS   Mackenzie,P.I., Owens,I.S., Burchell,B., Bock,K.W., Bairoch,A.,
            Belanger,A., Fournel-Gigleux,S., Green,M., Hum,D.W., Iyanagi,T.,
            Lancet,D., Louisot,P., Magdalou,J., Chowdhury,J.R., Ritter,J.K.,
            Schachter,H., Tephly,T.R., Tipton,K.F. and Nebert,D.W.
  TITLE     The UDP glycosyltransferase gene superfamily: recommended
            nomenclature update based on evolutionary divergence
  JOURNAL   Pharmacogenetics 7 (4), 255-269 (1997)
   PUBMED   9295054
REFERENCE   15 (residues 1 to 533)
  AUTHORS   Mojarrabi,B., Butler,R. and Mackenzie,P.I.
  TITLE     cDNA cloning and characterization of the human UDP
            glucuronosyl transferase, UGT1A3
  JOURNAL   Biochem. Biophys. Res. Commun. 225 (3), 785-790 (1996)
   PUBMED   8780690
REFERENCE   16 (residues 1 to 533)
  AUTHORS   Ritter,J.K., Chen,F., Sheen,Y.Y., Tran,H.M., Kimura,S.,
            Yeatman,M.T. and Owens,I.S.
  TITLE     A novel complex locus UGT1 encodes human bilirubin, phenol, and
            other UDP-glucuronosyltransferase isozymes with identical carboxyl
            termini
  JOURNAL   J. Biol. Chem. 267 (5), 3257-3261 (1992)
   PUBMED   1339448
REFERENCE   17 (residues 1 to 533)
  AUTHORS   Ritter,J.K., Crawford,J.M. and Owens,I.S.
  TITLE     Cloning of two human liver bilirubin UDP-glucuronosyltransferase
            cDNAs with expression in COS-1 cells
  JOURNAL   J. Biol. Chem. 266 (2), 1043-1047 (1991)
   PUBMED   1898728
COMMENT     PROVISIONAL REFSEQ: This record has not yet been subject to final
            NCBI review. The reference sequence was derived from M57899.1.
FEATURES             Location/Qualifiers
     source          1..533
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="2"
                     /map="2q37"
     Protein         1..533
                     /product="UDP glycosyltransferase 1 family, polypeptide
                     A1"
     mat_peptide     1..533
                     /gene="UGT1A1"
                     /product="UDP glycosyltransferase 1 family, polypeptide
                     A1"
                     /EC_number="2.4.1.17"






                     /standard_name="bilirubin UDP-glucuronosyltransferase
                     isozyme 1"
                     /note="G00-120-007"
     sig_peptide     1..533
                     /gene="UGT1A1"
                     /note="G00-120-007"
     Region          28..525
                     /region_name="UDP-glucoronosyl and UDP-glucosyl
                     transferase"
                     /note="UDPGT"
                     /db_xref="CDD:22944"
     variation       71
                     /replace="R"
                     /replace="G"
                     /db_xref="dbSNP:4148323"
     variation       511
                     /replace="P"
                     /replace="A"
                     /db_xref="dbSNP:1042709"
     CDS             1..533
                     /gene="UGT1A1"
                     /coded_by="NM_000463.1:16..1617"
                     /note="go_component: microsome [goid 0005792] [evidence
                     IEA];
                     go_component: integral to membrane [goid 0016021]
                     [evidence IEA];
                     go_function: UDP-glucuronosyltransferase [goid 0003981]
                     [evidence E];
                     go_function: glucuronosyltransferase activity [goid
                     0015020] [evidence IEA];
                     go_process: bilirubin conjugation [goid 0006789] [evidence
                     TAS] [pmid 1339448];
                     go_process: estrogen metabolism [goid 0008210] [evidence
                     TAS] [pmid 8780690];
                     go_process: digestion [goid 0007586] [evidence NR] [pmid
                     1898728];
                     go_process: metabolism [goid 0008152] [evidence IEA]"
                     /db_xref="GeneID:54658"
                     /db_xref="LocusID:54658"
                     /db_xref="MIM:191740"



ORIGIN
        1 mavesqggrp lvlglllcvl gpvvshagki llipvdgshw lsmlgaiqql qqrgheivvl
       61 apdaslyird gafytlktyp vpfqredvke sfvslghnvf endsflqrvi ktykkikkds
      121 amllsgcshl lhnkelmasl aessfdvmlt dpflpcspiv aqylslptvf flhalpcsle
      181 featqcpnpf syvprplssh sdhmtflqrv knmliafsqn flcdvvyspy atlaseflqr
      241 evtvqdllss asvwlfrsdf vkdyprpimp nmvfvgginc lhqnplsqef eayinasgeh
      301 givvfslgsm vseipekkam aiadalgkip qtvlwrytgt rpsnlannti lvkwlpqndl
      361 lghpmtrafi thagshgvye sicngvpmvm mplfgdqmdn akrmetkgag vtlnvlemts
      421 edlenalkav indksykeni mrlsslhkdr pvepldlavf wvefvmrhkg aphlrpaahd
      481 ltwyqyhsld vigfllavvl tvafitfkcc aygyrkclgk kgrvkkahks kth
//



7: NP_009051. UDP glycosyltrans...[gi:6005930] BLink, Domains, Links



LOCUS       NP_009051                534 aa            linear   PRI 24-DEC-2003
DEFINITION  UDP glycosyltransferase 1 family, polypeptide A4 [Homo sapiens].
ACCESSION   NP_009051
VERSION     NP_009051.1  GI:6005930
DBSOURCE    REFSEQ: accession NM_007120.1
KEYWORDS
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (residues 1 to 534)
  AUTHORS   Mackenzie,P.I., Owens,I.S., Burchell,B., Bock,K.W., Bairoch,A.,
            Belanger,A., Fournel-Gigleux,S., Green,M., Hum,D.W., Iyanagi,T.,
            Lancet,D., Louisot,P., Magdalou,J., Chowdhury,J.R., Ritter,J.K.,
            Schachter,H., Tephly,T.R., Tipton,K.F. and Nebert,D.W.
  TITLE     The UDP glycosyltransferase gene superfamily: recommended
            nomenclature update based on evolutionary divergence
  JOURNAL   Pharmacogenetics 7 (4), 255-269 (1997)
   PUBMED   9295054
REFERENCE   2  (residues 1 to 534)
  AUTHORS   Monaghan,G., Clarke,D.J., Povey,S., See,C.G., Boxer,M. and
            Burchell,B.
  TITLE     Isolation of a human YAC contig encompassing a cluster of UGT2
            genes and its regional localization to chromosome 4q13
  JOURNAL   Genomics 23 (2), 496-499 (1994)
   PUBMED   7835904
REFERENCE   3  (residues 1 to 534)
  AUTHORS   Monaghan,G., Povey,S., Burchell,B. and Boxer,M.
  TITLE     Localization of a bile acid UDP-glucuronosyltransferase gene
            (UGT2B) to chromosome 4 using the polymerase chain reaction
  JOURNAL   Genomics 13 (3), 908-909 (1992)
   PUBMED   1639428
REFERENCE   4  (residues 1 to 534)
  AUTHORS   Ritter,J.K., Chen,F., Sheen,Y.Y., Tran,H.M., Kimura,S.,
            Yeatman,M.T. and Owens,I.S.
  TITLE     A novel complex locus UGT1 encodes human bilirubin, phenol, and
            other UDP-glucuronosyltransferase isozymes with identical carboxyl
            termini
  JOURNAL   J. Biol. Chem. 267 (5), 3257-3261 (1992)
   PUBMED   1339448
REFERENCE   5  (residues 1 to 534)
  AUTHORS   Ritter,J.K., Crawford,J.M. and Owens,I.S.
  TITLE     Cloning of two human liver bilirubin UDP-glucuronosyltransferase
            cDNAs with expression in COS-1 cells
  JOURNAL   J. Biol. Chem. 266 (2), 1043-1047 (1991)
   PUBMED   1898728
COMMENT     PROVISIONAL REFSEQ: This record has not yet been subject to final
            NCBI review. The reference sequence was derived from M57951.1.
FEATURES             Location/Qualifiers
     source          1..534
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="2"
                     /map="2q37"
     Protein         1..534
                     /product="UDP glycosyltransferase 1 family, polypeptide
                     A4"
     variation       11
                     /replace="R"
                     /replace="W"
                     /db_xref="dbSNP:3892221"
     sig_peptide     12..22
                     /gene="UGT2B"
                     /note="G00-127-753"
     mat_peptide     23..534
                     /gene="UGT2B"
                     /product="UDP glycosyltransferase 1 family, polypeptide
                     A4"
                     /EC_number="2.4.1.17"
                     /note="G00-127-753"
     variation       24
                     /replace="P"
                     /replace="T"
                     /db_xref="dbSNP:6755571"
     Region          29..526
                     /region_name="UDP-glucoronosyl and UDP-glucosyl
                     transferase"
                     /note="UDPGT"
                     /db_xref="CDD:22944"
     variation       48
                     /replace="L"
                     /replace="V"
                     /db_xref="dbSNP:2011425"
     variation       512
                     /replace="P"
                     /replace="A"
                     /db_xref="dbSNP:1042709"
     CDS             1..534
                     /gene="UGT1A4"
                     /coded_by="NM_007120.1:30..1634"
                     /note="go_component: microsome [goid 0005792] [evidence
                     IEA];
                     go_component: endoplasmic reticulum [goid 0005783]
                     [evidence TAS] [pmid 1898728];
                     go_component: integral to membrane [goid 0016021]
                     [evidence IEA];
                     go_function: glucuronosyltransferase activity [goid
                     0015020] [evidence IEA];
                     go_process: metabolism [goid 0008152] [evidence IEA]"
                     /db_xref="GeneID:54657"
                     /db_xref="LocusID:54657"
                     /db_xref="MIM:606429"

ORIGIN
        1 marglqvplp rlatglllll svqpwaesgk vlvvptdgsp wlsmrealre lharghqavv
       61 ltpevnmhik eekfftltay avpwtqkefd rvtlgytqgf fetehllkry srsmaimnnv
      121 slalhrccve llhnealirh lnatsfdvvl tdpvnlcgav lakylsipav ffwryipcdl
      181 dfkgtqcpnp ssyipklltt nsdhmtflqr vknmlyplal syichtfsap yaslaselfq
      241 revsvvdlvs yasvwlfrgd fvmdyprpim pnmvfiggin cangkplsqe feayinasge
      301 hgivvfslgs mvseipekka maiadalgki pqtvlwrytg trpsnlannt ilvkwlpqnd
      361 llghpmtraf ithagshgvy esicngvpmv mmplfgdqmd nakrmetkga gvtlnvlemt
      421 sedlenalka vindksyken imrlsslhkd rpvepldlav fwvefvmrhk gaphlrpaah
      481 dltwyqyhsl dvigfllavv ltvafitfkc caygyrkclg kkgrvkkahk skth



8: NP_001063. UDP glycosyltrans...[gi:4507815] BLink, Domains, Links



LOCUS       NP_001063                531 aa            linear   PRI 23-DEC-2003
DEFINITION  UDP glycosyltransferase 1 family, polypeptide A6 [Homo sapiens].
ACCESSION   NP_001063
VERSION     NP_001063.1  GI:4507815
DBSOURCE    REFSEQ: accession NM_001072.1
KEYWORDS
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (residues 1 to 531)
  AUTHORS   Kohle,C., Mohrle,B., Munzel,P.A., Schwab,M., Wernet,D., Badary,O.A.
            and Bock,K.W.
  TITLE     Frequent co-occurrence of the TATA box mutation associated with
            Gilbert's syndrome (UGT1A1*28) with other polymorphisms of the
            UDP-glucuronosyltransferase-1 locus (UGT1A6*2 and UGT1A7*3) in
            Caucasians and Egyptians
  JOURNAL   Biochem. Pharmacol. 65 (9), 1521-1527 (2003)
   PUBMED   12732365
  REMARK    GeneRIF: Frequent haplotypes containing several UGT1 allelic
            variants should be taken into account in studies on the association
            between diseases, abnormal drug reactions, and UGT1 family
            polymorphisms.
REFERENCE   2  (residues 1 to 531)
  AUTHORS   Antonio,L., Xu,J., Little,J.M., Burchell,B., Magdalou,J. and
            Radominska-Pandya,A.
  TITLE     Glucuronidation of catechols by human hepatic, gastric, and
            intestinal microsomal UDP-glucuronosyltransferases (UGT) and
            recombinant UGT1A6, UGT1A9, and UGT2B7
  JOURNAL   Arch. Biochem. Biophys. 411 (2), 251-261 (2003)
   PUBMED   12623074
  REMARK    GeneRIF: These results demonstrate for the first time
            glucuronidation of catechols by gastric and intestinal microsomal
            UGTs and three human recombinant UGT isoforms. Recombinant human
            UGT1A6, 1A9, and 2B7 effectively catalyzed catechol glucuronidation
REFERENCE   3  (residues 1 to 531)
  AUTHORS   Peters,W.H., te Morsche,R.H. and Roelofs,H.M.
  TITLE     Combined polymorphisms in UDP-glucuronosyltransferases 1A1 and 1A6:
            implications for patients with Gilbert's syndrome
  JOURNAL   J. Hepatol. 38 (1), 3-8 (2003)
   PUBMED   12480553
  REMARK    GeneRIF: Most patients with Gilbert's syndrome may have
            abnormalities in glucuronidation of aspirin or coumarin- and
            dopamine-derivatives, due to this combination of UGT1A1*28 and
            UGT1A6*2 genotypes.
REFERENCE   4  (residues 1 to 531)
  AUTHORS   Senay,C., Jedlitschky,G., Terrier,N., Burchell,B., Magdalou,J. and
            Fournel-Gigleux,S.
  TITLE     The importance of cysteine 126 in the human liver
            UDP-glucuronosyltransferase UGT1A6
  JOURNAL   Biochim. Biophys. Acta 1597 (1), 90-96 (2002)
   PUBMED   12009407
  REMARK    GeneRIF: relevance of cysteine 126 in the glucuronidation process
REFERENCE   5  (residues 1 to 531)
  AUTHORS   Gong,Q.H., Cho,J.W., Huang,T., Potter,C., Gholami,N., Basu,N.K.,
            Kubota,S., Carvalho,S., Pennington,M.W., Owens,I.S. and
            Popescu,N.C.
  TITLE     Thirteen UDPglucuronosyltransferase genes are encoded at the human
            UGT1 gene complex locus
  JOURNAL   Pharmacogenetics 11 (4), 357-368 (2001)
   PUBMED   11434514
REFERENCE   6  (residues 1 to 531)
  AUTHORS   Mackenzie,P.I., Owens,I.S., Burchell,B., Bock,K.W., Bairoch,A.,
            Belanger,A., Fournel-Gigleux,S., Green,M., Hum,D.W., Iyanagi,T.,
            Lancet,D., Louisot,P., Magdalou,J., Chowdhury,J.R., Ritter,J.K.,
            Schachter,H., Tephly,T.R., Tipton,K.F. and Nebert,D.W.
  TITLE     The UDP glycosyltransferase gene superfamily: recommended
            nomenclature update based on evolutionary divergence
  JOURNAL   Pharmacogenetics 7 (4), 255-269 (1997)
   PUBMED   9295054
REFERENCE   7  (residues 1 to 531)
  AUTHORS   Ritter,J.K., Chen,F., Sheen,Y.Y., Tran,H.M., Kimura,S.,
            Yeatman,M.T. and Owens,I.S.
  TITLE     A novel complex locus UGT1 encodes human bilirubin, phenol, and
            other UDP-glucuronosyltransferase isozymes with identical carboxyl
            termini
  JOURNAL   J. Biol. Chem. 267 (5), 3257-3261 (1992)
   PUBMED   1339448
REFERENCE   8  (residues 1 to 531)
  AUTHORS   Harding,D., Jeremiah,S.J., Povey,S. and Burchell,B.
  TITLE     Chromosomal mapping of a human phenol UDP-glucuronosyltransferase,
            GNT1
  JOURNAL   Ann. Hum. Genet. 54 (Pt 1), 17-21 (1990)
   PUBMED   2108603
REFERENCE   9  (residues 1 to 531)
  AUTHORS   Harding,D., Fournel-Gigleux,S., Jackson,M.R. and Burchell,B.
  TITLE     Cloning and substrate specificity of a human phenol
            UDP-glucuronosyltransferase expressed in COS-7 cells
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 85 (22), 8381-8385 (1988)
   PUBMED   3141926
COMMENT     PROVISIONAL REFSEQ: This record has not yet been subject to final
            NCBI review. The reference sequence was derived from J04093.1.
FEATURES             Location/Qualifiers
     source          1..531
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="2"
                     /map="2q37"
     Protein         1..531
                     /product="UDP glycosyltransferase 1 family, polypeptide
                     A6"
     Region          27..523
                     /region_name="UDP-glucoronosyl and UDP-glucosyl
                     transferase"
                     /note="UDPGT"
                     /db_xref="CDD:22944"
     CDS             1..531
                     /gene="UGT1A6"
                     /coded_by="NM_001072.1:88..1683"
                     /note="go_component: microsome [goid 0005792] [evidence
                     NAS];
                     go_component: integral to membrane [goid 0016021]
                     [evidence IEA];
                     go_function: glucuronosyltransferase activity [goid
                     0015020] [evidence IEA];
                     go_process: xenobiotic metabolism [goid 0006805] [evidence
                     IDA] [pmid 3141926]"
                     /db_xref="GeneID:54578"
                     /db_xref="LocusID:54578"
                     /db_xref="MIM:606431"
ORIGIN



        1 macllrsfqr isagvfflal wgmvvgdkll vvpqdgshwl smkdivevls drgheivvvv
       61 pevnlllkey kyytrkiypv pydqeelknr yqsfgnnhfa ersfltapqt eyrnnmivig
      121 lyfincqsll qdrdtlnffk eskfdalftd palpcgvila eylglpsvyl frgfpcsleh
      181 tfsrspdpvs yiprcytkfs dhmtfsqrva nflvnllepy lfyclfskye klasavlkrd
      241 vdiitlsevs vwllrydfvl eyprpvmpnm vfigginckk rkdlsqefea yinasgehgi
      301 vvfslgsmvs eipekkamai adalgknpqt vlwrytgtrp snlanntilv kwlpqndllg
      361 hpmtrafith agshgvyesi cngvpmvmmp lfgdqmdnak rmetkgagvt lnvlemtsed
      421 lenalkavin dksykenimr lsslhkdrpv epldlavfwv efvmrhkgap hlrpaahdlt
      481 wyqyhsldvi gfllavvltv afitfkccpy gypkclgkkg rvkkahkskt h
//



9: NP_061950. UDP glycosyltrans...[gi:41282213] BLink, Links



LOCUS       NP_061950                530 aa            linear   PRI 25-JAN-2004
DEFINITION  UDP glycosyltransferase 1 family, polypeptide A7;
            UDP-glucuronosyltransferase 1A7 [Homo sapiens].
ACCESSION   NP_061950
VERSION     NP_061950.2  GI:41282213
DBSOURCE    REFSEQ: accession NM_019077.2
KEYWORDS
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (residues 1 to 530)
  AUTHORS   Zhang,T., Haws,P. and Wu,Q.
  TITLE     Multiple Variable First Exons: A Mechanism for Cell- and
            Tissue-Specific Gene Regulation
  JOURNAL   Genome Res. 14 (1), 79-89 (2004)
   PUBMED   14672974
REFERENCE   2  (residues 1 to 530)
  AUTHORS   Kohle,C., Mohrle,B., Munzel,P.A., Schwab,M., Wernet,D., Badary,O.A.
            and Bock,K.W.
  TITLE     Frequent co-occurrence of the TATA box mutation associated with
            Gilbert's syndrome (UGT1A1*28) with other polymorphisms of the
            UDP-glucuronosyltransferase-1 locus (UGT1A6*2 and UGT1A7*3) in
            Caucasians and Egyptians
  JOURNAL   Biochem. Pharmacol. 65 (9), 1521-1527 (2003)
   PUBMED   12732365
  REMARK    GeneRIF: Frequent haplotypes containing several UGT1 allelic
            variants should be taken into account in studies on the association
            between diseases, abnormal drug reactions, and UGT1 family
            polymorphisms.
REFERENCE   3  (residues 1 to 530)
  AUTHORS   Gong,Q.H., Cho,J.W., Huang,T., Potter,C., Gholami,N., Basu,N.K.,
            Kubota,S., Carvalho,S., Pennington,M.W., Owens,I.S. and
            Popescu,N.C.
  TITLE     Thirteen UDPglucuronosyltransferase genes are encoded at the human
            UGT1 gene complex locus
  JOURNAL   Pharmacogenetics 11 (4), 357-368 (2001)
   PUBMED   11434514
REFERENCE   4  (residues 1 to 530)
  AUTHORS   Mackenzie,P.I., Owens,I.S., Burchell,B., Bock,K.W., Bairoch,A.,
            Belanger,A., Fournel-Gigleux,S., Green,M., Hum,D.W., Iyanagi,T.,
            Lancet,D., Louisot,P., Magdalou,J., Chowdhury,J.R., Ritter,J.K.,
            Schachter,H., Tephly,T.R., Tipton,K.F. and Nebert,D.W.
  TITLE     The UDP glycosyltransferase gene superfamily: recommended
            nomenclature update based on evolutionary divergence
  JOURNAL   Pharmacogenetics 7 (4), 255-269 (1997)
   PUBMED   9295054
REFERENCE   5  (residues 1 to 530)
  AUTHORS   Strassburg,C.P., Oldhafer,K., Manns,M.P. and Tukey,R.H.
  TITLE     Differential expression of the UGT1A locus in human liver, biliary,
            and gastric tissue: identification of UGT1A7 and UGT1A10
            transcripts in extrahepatic tissue
  JOURNAL   Mol. Pharmacol. 52 (2), 212-220 (1997)
   PUBMED   9271343
COMMENT     PROVISIONAL REFSEQ: This record has not yet been subject to final
            NCBI review. The reference sequence was derived from AY435142.1.
            On Jan 25, 2004 this sequence version replaced gi:29789084.
FEATURES             Location/Qualifiers
     source          1..530
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="2"
                     /map="2q37"
     Protein         1..530
                     /product="UDP glycosyltransferase 1 family, polypeptide
                     A7"
                     /note="UDP-glucuronosyltransferase 1A7"
     Region          22..506
                     /region_name="UDP-glucuronosyl and UDP-glucosyl
                     transferase [Carbohydrate transport and metabolism, Energy
                     production and conversion]"
                     /note="KOG1192"
                     /db_xref="CDD:18981"
     Region          26..522
                     /region_name="UDP-glucoronosyl and UDP-glucosyl
                     transferase"
                     /note="UDPGT"
                     /db_xref="CDD:24386"
     CDS             1..530
                     /gene="UGT1A7"
                     /coded_by="NM_019077.2:1..1593"
                     /note="go_component: extracellular [goid 0005576]
                     [evidence IEA];
                     go_function: UDP-glucuronosyltransferase [goid 0003981]
                     [evidence NR];
                     go_function: calcium ion binding [goid 0005509] [evidence
                     IEA];
                     go_function: transferase activity [goid 0016740] [evidence
                     IEA];
                     go_function: transferase activity, transferring hexosyl
                     groups [goid 0016758] [evidence IEA];
                     go_process: metabolism [goid 0008152] [evidence IEA]"
                     /db_xref="GeneID:54577"
                     /db_xref="LocusID:54577"
                     /db_xref="MIM:606432"



ORIGIN
        1 maragwtgll plyvcllltc gfakagkllv vpmdgshwft mqsvveklil rghevvvvmp
       61 evswqlgrsl nctvktysts ytledqdref mvfadarwta plrsafsllt sssngifdlf
      121 fsncrslfnd rklveylkes cfdavfldpf dacglivaky fslpsvvfar gifchyleeg
      181 aqcpaplsyv prlllgfsda mtfkervwnh imhleehlfc pyffknvlei aseilqtpvt
      241 aydlyshtsi wllrtdfvle ypkpvmpnmi figginchqg kpvpmefeay inasgehgiv
      301 vfslgsmvse ipekkamaia dalgkipqtv lwrytgtrps nlanntilvk wlpqndllgh
      361 pmtrafitha gshgvyesic ngvpmvmmpl fgdqmdnakr metkgagvtl nvlemtsedl
      421 enalkavind ksykenimrl sslhkdrpve pldlavfwve fvmrhkgaph lrpaahdltw
      481 yqyhsldvig fllavvltva fitfkccayg yrkclgkkgr vkkahkskth
//



10: P22310. UDP-glucuronosylt...[gi:136731]

LOCUS       P22310                   534 aa
DEFINITION  UDP-glucuronosyltransferase 1-4 precursor, microsomal
            (UDP-glucuronosyltransferase 1A4) (UDPGT) (UGT1*4) (UGT1-04)
            (UGT1.4) (UGT-1D) (UGT1D) (Bilirubin specific UDPGT isozyme 2)
            (HUG-BR2).
ACCESSION   P22310
VERSION     P22310  GI:136731
DBSOURCE    swissprot: locus UD14_HUMAN, accession P22310;
            class: standard,
            created: Aug 1, 1991.
            sequence updated: Aug 1, 1991.
            annotation updated: Sep 15, 2003.
            xrefs: gi: 340136, gi: 340137, gi: 340129, gi: 459838, gi: 340127,
            gi: 340128, gi: 184474, gi: 184475, gi: 11118740, gi: 11118747
            xrefs (non-sequence databases): GenewHGNC:12536, MIM 606429, MIM
            191740, MIM 143500, MIM 218800, GO0005783, InterProIPR002213,
            PfamPF00201, PROSITEPS00375
KEYWORDS    Transferase; Glycosyltransferase; Glycoprotein; Transmembrane;
            Signal; Multigene family; Microsome; Alternative splicing; Disease
            mutation.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (residues 1 to 534)
  AUTHORS   Ritter,J.K., Crawford,J.M. and Owens,I.S.
  TITLE     Cloning of two human liver bilirubin UDP-glucuronosyltransferase
            cDNAs with expression in COS-1 cells
  JOURNAL   J. Biol. Chem. 266 (2), 1043-1047 (1991)
  MEDLINE   91093210
   PUBMED   1898728
  REMARK    SEQUENCE FROM N.A.
            TISSUE=Liver
REFERENCE   2  (residues 1 to 534)
  AUTHORS   Ritter,J.K., Chen,F., Sheen,Y.Y., Tran,H.M., Kimura,S.,
            Yeatman,M.T. and Owens,I.S.
  TITLE     A novel complex locus UGT1 encodes human bilirubin, phenol, and
            other UDP-glucuronosyltransferase isozymes with identical carboxyl
            termini
  JOURNAL   J. Biol. Chem. 267 (5), 3257-3261 (1992)
  MEDLINE   92147680
   PUBMED   1339448
  REMARK    SEQUENCE FROM N.A., AND TISSUE SPECIFICITY.
REFERENCE   3  (residues 1 to 534)
  AUTHORS   Gong,Q.H., Cho,J.W., Huang,T., Potter,C., Gholami,N., Basu,N.K.,
            Kubota,S., Carvalho,S., Pennington,M.W., Owens,I.S. and
            Popescu,N.C.
  TITLE     Thirteen UDPglucuronosyltransferase genes are encoded at the human
            UGT1 gene complex locus
  JOURNAL   Pharmacogenetics 11 (4), 357-368 (2001)
  MEDLINE   21327373
   PUBMED   11434514
  REMARK    SEQUENCE FROM N.A.
REFERENCE   4  (residues 1 to 534)
  AUTHORS   Bosma,P.J., Chowdhury,J.R., Huang,T.J., Lahiri,P., Elferink,R.P.,
            Van Es,H.H., Lederstein,M., Whitington,P.F., Jansen,P.L. and
            Chowdhury,N.R.
  TITLE     Mechanisms of inherited deficiencies of multiple
            UDP-glucuronosyltransferase isoforms in two patients with
            Crigler-Najjar syndrome, type I
  JOURNAL   FASEB J. 6 (10), 2859-2863 (1992)
  MEDLINE   92339803
   PUBMED   1634050
  REMARK    VARIANT CN-I PHE-376.
REFERENCE   5  (residues 1 to 534)
  AUTHORS   Aono,S., Yamada,Y., Keino,H., Hanada,N., Nakagawa,T., Sasaoka,Y.,
            Yazawa,T., Sato,H. and Koiwai,O.
  TITLE     Identification of defect in the genes for bilirubin
            UDP-glucuronosyl-transferase in a patient with Crigler-Najjar
            syndrome type II
  JOURNAL   Biochem. Biophys. Res. Commun. 197 (3), 1239-1244 (1993)
  MEDLINE   94107323
   PUBMED   8280139
  REMARK    VARIANTS CN-II PRO-132 AND ASP-487.
REFERENCE   6  (residues 1 to 534)
  AUTHORS   Moghrabi,N., Clarke,D.J., Boxer,M. and Burchell,B.
  TITLE     Identification of an A-to-G missense mutation in exon 2 of the UGT1
            gene complex that causes Crigler-Najjar syndrome type 2
  JOURNAL   Genomics 18 (1), 171-173 (1993)
  MEDLINE   94102756
   PUBMED   8276413
  REMARK    VARIANT CN-II ARG-332.
COMMENT     This SWISS-PROT entry is copyright. It is produced through a
            collaboration between the Swiss Institute of Bioinformatics and
            the EMBL outstation - the European Bioinformatics Institute.
            The original entry is available from http://www.expasy.ch/sprot
            and http://www.ebi.ac.uk/sprot
            [FUNCTION] UDPGT is of major importance in the conjugation and
            subsequent elimination of potentially toxic xenobiotics and
            endogenous compounds. This isoform glucuronidates bilirubin
            IX-alpha to form both the IX-alpha-C8 and IX-alpha-C12
            monoconjugates and diconjugate.
            [CATALYTIC ACTIVITY] UDP-glucuronate + acceptor = UDP + acceptor
            beta-D-glucuronoside.
            [SUBCELLULAR LOCATION] Microsomal.
            [ALTERNATIVE PRODUCTS] Event=Alternative splicing; Named
            isoforms=1; Comment=A number of isoforms are produced. The
            different isozymes have a different N-terminal domain and a common
            C-terminal domain of 245 residues; Name=1; IsoId=P22310-1;
            Sequence=Displayed.
            [TISSUE SPECIFICITY] Expressed in liver. Not expressed in skin or
            kidney.
            [INDUCTION] By phenobarbital.
            [DISEASE] THE GILBERT'S SYNDROME IS SHOWN TO OCCUR AS A CONSEQUENCE
            OF REDUCED BILIRUBIN TRANSFERASE ACTIVITY. THE DISORDER, IS MOST
            OFTEN DETECTED IN YOUNG ADULTS WITH VAGUE NONSPECIFIC COMPLAINTS. A
            MORE SEVERE INHERITABLE DEFICIENCY IN BILIRUBIN ACTIVITY EXIST IN
            CRIGLER-NAJJAR (CN): PATIENTS WITH TYPE I (RECESSIVE TRAIT) HAVE
            SEVERE HYPERBILIRUBINEMIA AND USUALLY DIE OF KERNICTERUS (BILIRUBIN
            ACCUMULATION IN THE BASAL GANGLIA AND BRAINSTEM NUCLEI) WITHIN THE
            FIRST YEAR OF LIFE. PATIENTS WITH TYPE II (DOMINANT TRAIT) HAVE
            LESS SEVERE HYPERBILIRUBINEMIA AND USUALLY SURVIVE INTO ADULTHOOD
            WITHOUT NEUROLOGIC DAMAGE. PHENOBARBITAL, WHICH INDUCES THE
            PARTIALLY DEFICIENT GLUCURONYL TRANSFERASE, CAN DIMINISH THE
            JAUNDICE.
            [SIMILARITY] Belongs to the UDP-glycosyltransferase family.
FEATURES             Location/Qualifiers
     source          1..534
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
     gene            1..534
                     /gene="UGT1A4"
                     /note="synonyms: UGT1, GNT1"
     Protein         1..534
                     /gene="UGT1A4"
                     /product="UDP-glucuronosyltransferase 1-4 precursor,
                     microsomal"
                     /EC_number="2.4.1.17"
     Region          1..28
                     /gene="UGT1A4"
                     /region_name="Signal"
                     /note="POTENTIAL."
     Region          29..534
                     /gene="UGT1A4"
                     /region_name="Mature chain"
                     /note="UDP-GLUCURONOSYLTRANSFERASE 1-4."
     Site            119
                     /gene="UGT1A4"
                     /site_type="glycosylation"
                     /note="N-LINKED (GLCNAC...) (POTENTIAL)."
     Region          132
                     /gene="UGT1A4"
                     /region_name="Variant"
                     /note="L -> P (IN CRIGLER-NAJJAR TYPE II).
                     /FTId=VAR_009506."
     Site            142
                     /gene="UGT1A4"
                     /site_type="glycosylation"
                     /note="N-LINKED (GLCNAC...) (POTENTIAL)."
     Site            296
                     /gene="UGT1A4"
                     /site_type="glycosylation"
                     /note="N-LINKED (GLCNAC...) (POTENTIAL)."
     Region          332
                     /gene="UGT1A4"
                     /region_name="Variant"
                     /note="Q -> R (IN CRIGLER-NAJJAR TYPE II).
                     /FTId=VAR_007710."
     Site            348
                     /gene="UGT1A4"
                     /site_type="glycosylation"
                     /note="N-LINKED (GLCNAC...) (POTENTIAL)."
     Region          376
                     /gene="UGT1A4"
                     /region_name="Variant"
                     /note="S -> F (IN CRIGLER-NAJJAR TYPE I).
                     /FTId=VAR_007711."
     Region          487
                     /gene="UGT1A4"
                     /region_name="Variant"
                     /note="Y -> D (IN CRIGLER-NAJJAR TYPE II).
                     /FTId=VAR_009507."
     Region          492..508
                     /gene="UGT1A4"
                     /region_name="Transmembrane region"
                     /note="POTENTIAL."



ORIGIN
        1 marglqvplp rlatglllll svqpwaesgk vlvvptdgsp wlsmrealre lharghqavv
       61 ltpevnmhik eekfftltay avpwtqkefd rvtlgytqgf fetehllkry srsmaimnnv
      121 slalhrccve llhnealirh lnatsfdvvl tdpvnlcgav lakylsipav ffwryipcdl
      181 dfkgtqcpnp ssyipklltt nsdhmtflqr vknmlyplal syichtfsap yaslaselfq
      241 revsvvdlvs yasvwlfrgd fvmdyprpim pnmvfiggin cangkplsqe feayinasge
      301 hgivvfslgs mvseipekka maiadalgki pqtvlwrytg trpsnlannt ilvkwlpqnd
      361 llghpmtraf ithagshgvy esicngvpmv mmplfgdqmd nakrmetkga gvtlnvlemt
      421 sedlenalka vindksyken imrlsslhkd rpvepldlav fwvefvmrhk gaphlrpaah
      481 dltwyqyhsl dvigfllavv ltvafitfkc caygyrkclg kkgrvkkahk skth
//